Abstract


Ordered binary decision diagrams (OBDDs) are graph-based data structures for representing
Boolean functions. They have found widespread use in computer-aided design
and in formal verification of digital circuits. Minimal trellises are graphical representations
of error-correcting codes that play a prominent role in coding theory. This paper
establishes a close connection between these two graphical models, as follows. Let $C$
be a binary code of length $n$, and let $f_C(x_1, \ldots, x_n)$ be the Boolean function that takes
the value 0 at $x_1, \ldots, x_n$ if and only if $(x_1, \ldots, x_n) \in C$. Given this natural one-to-one
correspondence between Boolean functions and binary codes, we prove that the minimal
proper trellis for a code $C$ with minimum distance $d > 1$ is isomorphic to the single-terminal
OBDD for its Boolean indicator function $f_C(x_1, \ldots, x_n)$. Prior to this result,
the extensive research during the past decade on binary decision diagrams - in computer
engineering - and on minimal trellises - in coding theory - has been carried out independently.
As outlined in this work, the realization that binary decision diagrams and
minimal trellises are essentially the same data structure opens up a range of promising
possibilities for transfer of ideas between these disciplines.


1.  Introduction 


Algorithms on graphical structures play a central role in both communications and computer
engineering. Most modern communications systems make use of error-correcting
codes in order to increase reliability and manage resources such as power and spectrum.
In this context, trellises [75] and related graphs [37, 38] have emerged as a unifying framework
for understanding, manipulating, and decoding error-correcting codes of all types.
In  computer  engineering,  ordered  binary  decision  diagrams  [14,  16]  and  their  variants 
have  found  widespread  use  for  a  range  of  applications,  including  circuit  checking,  logic 
synthesis,  and  test  generation.  Binary  decision  diagrams  are  at  the  core  of  many  tools 
for  formal  verification,  and  have  been  a  major  reason  for  recent  advances  in  this  area. 

In  this  paper,  we  show  that  there  is  a  very  close  relationship  between  trellises  and  binary 
decision  diagrams.  In  particular,  we  show  that  if  a  binary  error-correcting  code  C  has 
minimum  distance  greater  than  one,  then  the  minimal  proper  trellis  for  C  is  isomorphic 
to  the  single-terminal  ordered  binary  decision  diagram  (OBDD)  for  this  code,  viewed 
as a Boolean function. Our proof is based on a direct argument using a vertex-merging
construction  of  OBDDs  due  to  Bryant  [14,  16],  along  with  some  basic  results  on  minimal 
trellises.  We  thus  establish  a  bridge  between  previously  disparate  areas  of  research  that 
makes  possible  coordinated  exploration  and  transfer  of  ideas  between  them.  One  of  our 
goals  in  this  paper  is  to  make  the  two  research  communities  aware  of  each  other. 

Prior to this result, the historical development of ideas surrounding OBDDs and trellises
was independent, yet remarkably parallel. In coding, trellises were introduced by
Forney [30], and first used to represent and decode block codes by Bahl, Cocke, Jelinek,
and Raviv [5]. However, the subject remained dormant until the publication of [34, 63]
in 1988, which ignited a flurry of research during the past decade. To date, the study
of  trellises  for  block  codes  encompasses  a  sizable  body  of  results  —  a  comprehensive 
bibliography,  consisting  of  some  100  references,  may  be  found  in  the  recent  survey  [75]. 
In  a  similar  fashion,  the  idea  of  representing  Boolean  functions  as  decision  graphs  was 
recorded  in  the  early  papers  of  Lee  [57]  and  Akers  [2].  However,  their  widespread  use  as 
the  data  structure  of  choice  for  symbolic  Boolean  manipulation  started  with  the  work  of 
Bryant  [14]  in  1986,  who  formulated  a  set  of  algorithms  for  constructing  binary  decision 
diagrams, and operating upon them. Key to this algorithmic formulation was the requirement
that the variables along every path from the root to a leaf occur in a fixed order,
which  is  analogous  to  the  well-defined  depth  property  of  trellises.  During  the  past  decade, 
binary  decision  diagrams  have  been  a  very  active  research  topic  in  automated  logic  design 
and  verification,  and  the  subject  has  now  accumulated  a  vast  body  of  literature. 

Not  surprisingly,  the  key  results  and  the  central  research  problems  in  the  two  areas  share 
much  in  common.  A  fundamental  theorem  in  the  study  of  trellises  is  due  to  Muder  [63], 
who  showed  that  every  block  code  has  a  minimal  proper  trellis  representation,  and  any 
two  minimal  proper  trellises  for  the  same  code  are  isomorphic.  On  the  other  hand, 
Bryant  [14]  proved  that  the  OBDD  representation  of  a  Boolean  function  is  canonical:  for 
a  given  ordering  of  variables,  two  OBDDs  for  the  same  function  are  isomorphic.  We  now 
realize that these are two instances of the same result, described in different languages.
A central problem in the study of OBDDs is how to order the variables for a given function
so that the size of the resulting decision diagram is minimized. A similar problem
for  trellises,  known  [61,  75]  as  the  art  of  trellis  decoding  or  the  permutation  problem, 
asks  how  the  time  axis  for  a  given  code  should  be  permuted  in  order  to  minimize  the 
complexity  of  the  resulting  trellis.  Once  again,  these  are  essentially  two  instances  of  the 
same  problem.  In  both  cases,  the  research  is  centered  around  techniques  for  combating 
the exponential growth in the size of the graph; but the methods that have been developed
are complementary. The close relationship between OBDDs and minimal trellises
that  we  establish  here  may  therefore  lead  to  useful  results  for  each  discipline. 

We  point  out  that  the  possibility  of  connection  between  binary  decision  diagrams  and 
trellises was noted in passing by Horn and Kschischang [43], who wrote that “block-code
trellises appear to be closely related to graphs called binary decision diagrams that
are  used  to  represent  Boolean  functions.”  However,  to  the  best  of  our  knowledge,  this 
connection  was  never  pursued  in  the  literature  beyond  the  single  sentence  quoted  above. 

The  rest  of  this  paper  is  organized  as  follows.  In  order  to  make  our  results  accessible  to 
both  communities  —  computer  engineering  and  coding  —  we  start  with  a  brief  overview 
of  the  basic  concepts  concerning  BDDs  and  trellises  in  the  next  two  sections.  These  two 
sections  also  contain  pointers  to  the  literature  on  their  respective  subjects.  In  Section  4, 
we  prove  our  main  result:  the  correspondence  between  OBDDs  and  minimal  trellises. 
Some  directions  for  transfer  of  ideas  between  the  two  areas  are  then  discussed  in  Section  5. 


2.  Binary  decision  diagrams 

Binary  decision  diagrams  are  a  graph-based  data  structure  for  representing  Boolean 
functions  [14,  16].  They  have  found  widespread  use  in  computer-aided  design  of  digital 
circuits,  and  form  the  heart  of  many  tools  for  formal  verification  [3,  21,  26].  They  are 
also  used  extensively  in  logic  synthesis  [67],  and  in  various  aspects*  of  circuit  testing  [9]. 

The  success  of  binary  decision  diagrams  has  led  to  research  efforts  on  a  number  of  fronts, 
as  surveyed  in  [18].  First,  there  have  been  many  improvements  to  the  core  technology, 
refining  the  algorithms  and  representation  techniques  for  improved  performance  [12,  40, 
64,  66].  Secondly,  a  number  of  extensions  to  the  data  structure  have  been  developed, 
leading  to  a  more  general  class  of  representations  known  as  decision  diagrams.  Some 
of these extensions attempt to improve the compactness of representation [7, 28], while
others extend the class of functions that can be represented [20, 4, 23, 24, 27, 56]. Finally,
decision diagrams have been applied to a wider range of tasks in [60].

*The importance and potential impact of these methods can be gauged by the highly-publicized Intel
Pentium floating-point divider bug in 1994, which cost the company an estimated $475 million. It has
been shown [17] that Intel could have used ordered binary decision diagrams to detect and correct the
erroneous table entries in the Pentium floating-point divider.

In  this  section,  we  review  the  basics  of  binary  decision  diagrams,  and  in  particular  present 
the  canonical  algorithm  [14,  16]  for  building  the  OBDD  for  a  Boolean  function.  This 
algorithm  will  be  used  in  Section  4  to  construct  the  minimal  trellis  for  a  binary  code. 


2.1.  Construction  of  ordered  binary  decision  diagrams 

A binary decision diagram represents a Boolean function as a rooted, directed acyclic
graph. The leaves (vertices of out-degree zero) in this graph are called terminal vertices, or
simply terminals. The terminals are labeled 0 or 1, corresponding to the possible function
values. Each nonterminal vertex $v$ is labeled by a function variable var($v$) and has two
outgoing edges, corresponding to the cases where the variable takes on the value 0 or 1
and directed towards the two children of $v$, denoted $\hookrightarrow_0(v)$ and $\hookrightarrow_1(v)$, respectively. For
any truth assignment to the variables, the function value is determined by tracing a path
from the root to a terminal vertex, following the appropriate edge from each vertex.

One example of a binary decision diagram for a Boolean function $f(x_1, \ldots, x_n)$ is a full
binary decision tree, which contains $2^n$ terminals and $2^n - 1$ nonterminals. This is illustrated
in Figure 1a for the function* $(x_1 + x_2) \cdot x_3$. However, binary decision diagrams
are usually much more compact. For example, a smaller BDD for the same function is
illustrated in Figure 1b, while Figure 1c depicts a BDD for the function $x_1 + x_2 + x_3$.


Figure 1 (panels a, b, c). Examples of binary decision diagrams

A  dashed,  respectively  solid,  line  indicates  the  edge  that  is  followed 
when  the  decision  variable  is  0,  respectively  1. 

To introduce ordered binary decision diagrams, we impose an arbitrary total order $\prec$ on
the set of variables $x_1, \ldots, x_n$. Then the ordered binary decision diagram $\mathcal{D}$ for a Boolean
function $f(x_1, \ldots, x_n)$ is defined by the following properties: a) every path from the root
to a leaf in $\mathcal{D}$ encounters variables in ascending order, and b) it does not contain duplicate
terminals or nonterminals, or redundant tests (precise definitions of these terms follow
in the next paragraph). For example, the graphs in Figures 1b and 1c are OBDDs, if we
consider the variables to have the ordering $x_1 \prec x_2 \prec x_3$.

*We use the symbols $+$, $\cdot$, $\oplus$, and an overbar to denote Boolean OR, AND, EXCLUSIVE-OR, and NOT, respectively.

Bryant [14] proved that the OBDD representation of a given function is unique: for
a given ordering, two OBDDs for the same function are isomorphic. He also showed
that the OBDD for an arbitrary Boolean function $f(x_1, \ldots, x_n)$ can be constructed by
applying a set of reduction rules to the full binary decision tree for $f(x_1, \ldots, x_n)$. First,
terminals in the decision tree having the same label are merged. This step, known as
merging duplicate terminals, results in a directed graph with only two terminals, labeled
0 and 1. A nonterminal $v$ in this graph is said to be a redundant test if $\hookrightarrow_0(v) = \hookrightarrow_1(v)$.
Redundant tests may be removed, without altering the function being represented, by
deleting $v$ and redirecting all incoming edges to $\hookrightarrow_0(v)$. Two nonterminals $u$ and $v$ are
said to be duplicate if $\hookrightarrow_0(u) = \hookrightarrow_0(v)$, $\hookrightarrow_1(u) = \hookrightarrow_1(v)$, and var($u$) = var($v$). Duplicate
nonterminals can be merged by deleting one of the two vertices and redirecting all incoming
edges to the other vertex. Again, this does not affect the function being represented.

The reduction algorithm proceeds by iteratively merging duplicate nonterminals and removing
redundant tests. It terminates when no redundant test or duplicate nonterminals
remain.  This  algorithm  is  summarized  below. 


Construction  A 

Input: Boolean function $f(x_1, \ldots, x_n)$ and variable ordering $x_1 \prec \cdots \prec x_n$.
Output: Ordered binary decision diagram for $f(x_1, \ldots, x_n)$.

Algorithm: Starting with the full binary decision tree for $f(x_1, \ldots, x_n)$, proceed
as follows:

Step  1.  Merge  duplicate  terminals. 

Step  2.  Merge  all  duplicate  nonterminals. 

Step  3.  Remove  all  redundant  tests. 

Iterate  steps  2  and  3  until  no  duplicate  nonterminals  or  redundant  tests  remain. 


It is easy to see that Construction A always produces the unique OBDD for $f(x_1, \ldots, x_n)$.
To illustrate this construction, consider the Boolean function:

$$f(x_1, x_2, x_3, x_4, x_5) = (x_1 \oplus x_2 \oplus x_3) + (x_1 \oplus x_4) + (x_1 \oplus x_2 \oplus x_5)$$

Figure  2  shows  the  OBDD  for  this  function,  during  the  various  stages  of  its  construction: 
the  top  part  of  the  figure  depicts  the  binary  decision  tree  with  the  terminals  merged,  the 
center  shows  the  result  of  merging  duplicate  nonterminals,  and  the  bottom  part  shows 
the  BDD  obtained  after  removing  redundant  tests.  In  this  particular  example,  there  are 
no  additional  duplicate  nonterminals  generated  by  step  3,  so  the  algorithm  terminates. 
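
The reduction rules can be sketched in Python as follows. This is not a literal transcription of Construction A: instead of materializing the full decision tree and then rewriting it, the sketch walks the tree recursively and applies the two rules (skip redundant tests, share duplicate nonterminals) as each vertex is created, a standard technique that yields the same reduced diagram. The names `build_obdd`, `mk`, and `expand`, and the integer encoding of vertices, are assumptions made for illustration.

```python
def build_obdd(f, n):
    """Reduced OBDD of f under the order x1 < ... < xn.

    Returns (root, nodes). Terminals are the integers 0 and 1; every
    nonterminal is an integer id >= 2 that `nodes` maps to a triple
    (variable index, 0-child, 1-child).
    """
    unique = {}   # (var, child0, child1) -> id, used to merge duplicates
    nodes = {}    # id -> (var, child0, child1)

    def mk(var, lo, hi):
        if lo == hi:                  # redundant test: skip this vertex
            return lo
        key = (var, lo, hi)
        if key not in unique:         # otherwise reuse any duplicate vertex
            unique[key] = len(unique) + 2
            nodes[unique[key]] = key
        return unique[key]

    def expand(prefix):
        if len(prefix) == n:          # leaf of the decision tree
            return int(f(*prefix))    # equal terminals are shared by value
        lo = expand(prefix + (0,))
        hi = expand(prefix + (1,))
        return mk(len(prefix) + 1, lo, hi)

    return expand(()), nodes

# The example function above, with ^ for EXCLUSIVE-OR and | for OR.
def f(x1, x2, x3, x4, x5):
    return (x1 ^ x2 ^ x3) | (x1 ^ x4) | (x1 ^ x2 ^ x5)

root, nodes = build_obdd(f, 5)
print(len(nodes), "nonterminal vertices")
```

The sketch enumerates all $2^n$ assignments, so it is only meant for small examples such as this one.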

Figure 2. The OBDD for $(x_1 \oplus x_2 \oplus x_3) + (x_1 \oplus x_4) + (x_1 \oplus x_2 \oplus x_5)$ with
respect to the ordering $x_1 \prec x_2 \prec x_3 \prec x_4 \prec x_5$

The  OBDD  is  shown  during  the  various  stages  of  its  construction:  after  step  1  has 
been  carried  out  (top),  after  duplicate  nonterminals  have  been  merged  (center),  and 
after  redundant  tests  have  been  removed  (bottom).  Upon  completion  of  steps  1-3, 
there  are  no  additional  duplicate  nonterminals,  and  the  algorithm  terminates.  For 
clarity,  a  pair  of  parallel  edges  between  two  vertices,  is  shown  as  a  shaded  line. 

2.2. Operations on ordered binary decision diagrams

A  number  of  symbolic  operations  on  Boolean  functions  can  be  implemented  as  simple 
graph  algorithms  applied  to  their  OBDD  representations  [14,  16].  These  algorithms 
typically  have  complexities  that  are  polynomial  in  the  size  of  their  input.  We  give  just 
one  example  here.  To  describe  this  example,  we  will  need  some  more  notation  and  insight 
into  the  structure  of  ordered  binary  decision  diagrams. 

Given a Boolean function $f(x_1, \ldots, x_n)$, the function $f_x$ which replaces variable $x$ by the
value 1 is called the positive cofactor of $f$ with respect to $x$, while the function $f_{\bar{x}}$ that
replaces variable $x$ by the value 0 is called the negative cofactor. The Shannon expansion
(originally recognized by Boole [13]) expresses $f(x_1, \ldots, x_n)$ as follows:

$$f = x \cdot f_x + \bar{x} \cdot f_{\bar{x}} \qquad (1)$$

Since at least one of the two terms in the sum above must evaluate to zero, this decomposition
splits an arbitrary function into two mutually exclusive cases.
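
As a small worked instance of the expansion, take the function $f = (x_1 + x_2) \cdot x_3$ used in Figure 1 and expand about $x_3$, writing $\bar{x}_3$ for NOT $x_3$. The choice of expansion variable is ours, for illustration only.

$$f_{x_3} = (x_1 + x_2) \cdot 1 = x_1 + x_2, \qquad f_{\bar{x}_3} = (x_1 + x_2) \cdot 0 = 0$$

$$f = x_3 \cdot (x_1 + x_2) + \bar{x}_3 \cdot 0 = x_3 \cdot (x_1 + x_2)$$

Expanding the positive cofactor once more, now about $x_1$, gives $x_1 + x_2 = x_1 \cdot 1 + \bar{x}_1 \cdot x_2$, so each further cofactor depends on one variable fewer.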

Suppose that a vertex $v$ in an OBDD represents some function $f^*$. If var($v$) $= x$, then
$\hookrightarrow_1(v)$ represents the function $f^*_x$ and $\hookrightarrow_0(v)$ represents the function $f^*_{\bar{x}}$. We postulate
that the root vertex in the OBDD for a Boolean function $f(x_1, \ldots, x_n)$ represents the
function $f$ itself. Then following a path from the root to a leaf corresponds to taking successive
cofactors of $f(x_1, \ldots, x_n)$ until it reduces to a constant. In other words, OBDDs
are graphical representations of the Shannon expansion (1) of a Boolean function.

One use of OBDDs is to test the equivalence of two logic circuits [14, 16]. If the circuits
are represented as OBDDs corresponding to two functions $f$ and $g$, then the verification
is carried out by computing $f \oplus g$ and testing whether the result is the constant function 0.
This can be done efficiently using the fact that the cofactor operations distribute through
the Boolean operations; for example, $(f \oplus g)_x = f_x \oplus g_x$. Hence, we can compute $f \oplus g$ as
$x \cdot (f_x \oplus g_x) + \bar{x} \cdot (f_{\bar{x}} \oplus g_{\bar{x}})$. As a consequence, the verification can be efficiently carried
out using a recursive graph traversal algorithm. For more details on this and many other
applications  of  OBDDs,  we  refer  the  reader  to  the  survey  by  Bryant  [16]. 
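
The recursive traversal can be sketched as below. This is a minimal illustration under our own assumptions, not the implementation of [14, 16]: vertices are encoded as nested tuples `(var, child0, child1)` with the integers 0 and 1 as terminals, and the helper names (`mk`, `from_function`, `xor`, `equivalent`) are hypothetical. Memoization through `functools.lru_cache` stands in for the result table that a production package would maintain.

```python
from functools import lru_cache

def mk(var, lo, hi):
    """Create a vertex, skipping redundant tests (equal children)."""
    return lo if lo == hi else (var, lo, hi)

def from_function(fn, n, prefix=()):
    """Reduced diagram of fn under the order x1 < ... < xn (small n only)."""
    if len(prefix) == n:
        return int(fn(*prefix))
    return mk(len(prefix) + 1,
              from_function(fn, n, prefix + (0,)),
              from_function(fn, n, prefix + (1,)))

def top(u):
    """Index of the variable tested at u; terminals sort last."""
    return u[0] if isinstance(u, tuple) else float("inf")

def cofactors(u, var):
    """Negative and positive cofactor of u with respect to x_var."""
    if isinstance(u, tuple) and u[0] == var:
        return u[1], u[2]
    return u, u            # u does not test x_var at its top vertex

@lru_cache(maxsize=None)
def xor(f, g):
    """Diagram of f XOR g, using (f XOR g)_x = f_x XOR g_x."""
    if not isinstance(f, tuple) and not isinstance(g, tuple):
        return f ^ g
    var = min(top(f), top(g))
    f0, f1 = cofactors(f, var)
    g0, g1 = cofactors(g, var)
    return mk(var, xor(f0, g0), xor(f1, g1))

def equivalent(f, g):
    """Two diagrams agree exactly when their XOR is the constant 0."""
    return xor(f, g) == 0

a = from_function(lambda x1, x2, x3: (x1 | x2) & x3, 3)
b = from_function(lambda x1, x2, x3: (x1 & x3) | (x2 & x3), 3)
print(equivalent(a, b))
```

Because equal tuples compare equal in Python, duplicate vertices are identified by value here; a real package would share them by reference instead.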


3.  Minimal  trellises  for  block  codes 

Trellises  were  introduced  by  Forney  [30]  in  1967  as  a  conceptual  means  to  explain  the 
inner workings of the Viterbi algorithm [32] for decoding convolutional codes. IBM researchers
Bahl, Cocke, Jelinek, and Raviv [5] were the first to observe that linear block
codes  may  be  also  represented  by  a  trellis,  and  showed  how  to  construct  such  a  trellis. 
For  a  detailed  survey  of  the  trellis  theory  of  block  codes,  we  refer  the  reader  to  Vardy  [75]. 

Today,  trellises  are  used  extensively  in  the  construction  and  decoding  of  error-correcting 
codes,  where  their  applications  range  from  deep-space  communications  (trellises  were 
used  to  transmit  images  from  Mars  in  1977),  through  high-speed  modems,  to  household 
appliances such as CD players. Furthermore, trellises were also found useful in such areas
as channel equalization [31], hidden Markov models [29], and speech recognition [6].

In  this  section  we  present  the  definition  of  a  trellis,  and  only  briefly  touch  on  some  of 
its  properties.  We  also  define  the  minimal  proper  trellis  for  a  given  binary  code.  This 
notion  will  be  used  in  the  next  section  to  establish  the  connection  with  OBDDs. 

Loosely speaking, a trellis $T = (V, E, A)$ is an edge-labeled directed graph with the property
that every vertex in $T$ has a well-defined depth. We will regard each labeled, directed
edge $e \in E$ as an ordered triple $(v, v', a)$, and say that this edge begins at $v \in V$, ends
at $v' \in V$, and has label $a \in A$. With this terminology, we have the following definition.

Definition 1. A trellis $T = (V, E, A)$ of depth $n$ is an edge-labeled directed graph with
the following property: the vertex set $V$ can be partitioned as

$$V = V_0 \cup V_1 \cup \cdots \cup V_n$$

such that every edge in $T$ that begins at a vertex in $V_i$ ends at a vertex in $V_{i+1}$, and
every vertex in $T$ lies on at least one path from a vertex in $V_0$ to a vertex in $V_n$.

For $i = 0, 1, \ldots, n$, we will refer to $V_i$ as the set of vertices at time $i$, and call the ordered
index set $\mathcal{I} = \{0, 1, \ldots, n\}$ induced by the partition of the vertex set the time axis for $T$.
This  temporal  terminology  is  both  natural  and  standard  [75]  in  the  study  of  trellises. 

In most cases of interest, the subsets $V_0, V_n \subset V$ each consist of a single vertex, called
the root and the toor, respectively, and this will be assumed in the remainder of this
paper.  A  trellis  T  is  said  to  be  proper  if  the  edges  beginning  at  any  given  vertex  of  T  are 
labeled  distinctly.  It  is  said  to  be  co-proper  if  this  condition  holds  with  the  direction  of 
all  edges  reversed:  namely,  if  the  edges  ending  at  any  vertex  of  T  are  labeled  distinctly. 
A  trellis  T  is  said  to  be  biproper  if  it  is  both  proper  and  co-proper. 

The set of binary $n$-tuples is denoted $\mathbb{F}_2^n$. For $x, y \in \mathbb{F}_2^n$, the Hamming distance $d(x, y)$
is the number of positions where $x$ and $y$ differ. An $[n, M, d]$ binary block code $C$ is
a subset of $\mathbb{F}_2^n$ of cardinality $M$, such that $\min_{x \neq y \in C} d(x, y) = d$. The elements of $C$ are
called codewords. An $(n, k, d)$ binary linear code is a subspace of $\mathbb{F}_2^n$ of dimension $k$
and minimum distance $d$. An $(n, k, d)$ binary linear code can be specified either as the
row-space of a $k \times n$ binary generator matrix or as the kernel of an $(n-k) \times n$ binary
parity-check matrix. A block code $C$ is said to be rectangular if for all choices of $a, b, c, d$,
the fact that $(a, c), (a, d), (b, c) \in C$ implies that $(b, d) \in C$, where $(\cdot, \cdot)$ denotes string
concatenation. It is easy to see that every linear code is rectangular, but not vice versa.
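
These definitions are easy to check by brute force on small codes. The sketch below represents a code as a set of equal-length bit strings; the function names are ours, and `is_rectangular` reads the definition as ranging over every position at which a codeword can be split into a prefix and a suffix, which is an assumption about the intended quantification.

```python
from itertools import combinations

def hamming(x, y):
    """Number of positions where the equal-length strings x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def min_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(hamming(x, y) for x, y in combinations(sorted(code), 2))

def is_rectangular(code):
    """Check that (a,c), (a,d), (b,c) in C implies (b,d) in C at every split."""
    n = len(next(iter(code)))
    for i in range(n + 1):
        for ac in code:
            for ad in code:
                if ad[:i] != ac[:i]:          # need a common prefix a
                    continue
                for bc in code:
                    # bc shares the suffix c with ac; then b + d must be in C
                    if bc[i:] == ac[i:] and bc[:i] + ad[i:] not in code:
                        return False
    return True

C = {"00000", "11010", "01101", "10111"}       # the linear code of Figure 3
print(min_distance(C), is_rectangular(C))
print(is_rectangular({"00", "01", "10"}))      # a small non-rectangular set
```

The second call fails the test at the split after the first bit: 00, 01, and 10 are present, but 11 is not.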

Definition 2. Let $T = (V, E, \mathbb{F}_2)$ be a trellis of depth $n$. Then the sequence of edge labels
along each path from the root to the toor in $T$ defines an ordered binary $n$-tuple.
We  say  that  T  represents  a  binary  block  code  C  of  length  n,  or  simply  that  T  is  a  trellis 
for  C,  if  the  set  of  all  such  n-tuples  is  precisely  the  set  of  codewords  of  C. 
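
Definition 2 and the notion of a proper trellis can be made concrete with a short sketch. Edges are stored as the triples $(v, v', a)$ introduced above; the particular trellis, its vertex names, and the helper functions are hypothetical and chosen only for illustration (the trellis represents the even-weight code of length 3).

```python
# Hypothetical trellis of depth 3 with root "r" and toor "t".
EDGES = [
    ("r", "a", 0), ("r", "b", 1),
    ("a", "c", 0), ("a", "d", 1),
    ("b", "d", 0), ("b", "c", 1),
    ("c", "t", 0), ("d", "t", 1),
]

def codewords(edges, root, toor):
    """Label sequences along all paths from the root to the toor."""
    out = {}
    for v, w, a in edges:
        out.setdefault(v, []).append((w, a))

    def walk(v):
        if v == toor:
            return {""}
        return {str(a) + rest for w, a in out.get(v, []) for rest in walk(w)}

    return walk(root)

def is_proper(edges):
    """True if the edges beginning at each vertex are labeled distinctly."""
    seen = set()
    for v, _, a in edges:
        if (v, a) in seen:
            return False
        seen.add((v, a))
    return True

print(sorted(codewords(EDGES, "r", "t")), is_proper(EDGES))
```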

The  minimal  trellis  may  be  defined  in  a  number  of  different  ways  which,  in  most  cases,  are 
all  equivalent  to  the  following  definition.  We  say  that  a  trellis  T  for  a  code  C  of  length  n 
is  minimal  if  it  satisfies  the  following  property:  for  each  i  =  0, 1, . . . ,  n,  the  number  of 
vertices at time i in T is less than or equal to the number of vertices at time i in any other
trellis  for  C.  Given  a  code  C,  it  is  not  at  all  obvious  that  there  exists  a  minimal  trellis 
for  C.  Although  it  is  known  [51,  75,  76]  that  such  a  trellis  exists  (and  is,  in  fact,  unique 
up  to  graph  isomorphism)  if  the  code  C  is  rectangular,  there  are  also  examples  [52]  of 
non-rectangular  codes  that  do  not  admit  a  minimal  trellis  representation.  On  the  other 
hand,  this  problem  does  not  arise  if  we  restrict  our  attention  to  proper  trellises. 

Definition 3. Let T be a proper trellis for a code C of length n. We say that T is the
minimal  proper  trellis  for  C  if  it  satisfies  the  following  property:  for  each  i  =  0, 1, . . . ,  n, 
the  number  of  vertices  at  time  i  in  T  is  less  than  or  equal  to  the  number  of  vertices  at 
time  i  in  any  other  proper  trellis  for  C. 

One  of  the  fundamental  results  in  trellis  theory,  due  to  Muder  [63],  is  that  every  block 
code, whether it is rectangular or not, has a unique minimal proper trellis. For rectangular
codes (and, hence, also for linear codes), it is known [75] that the minimal proper
trellis  and  the  minimal  trellis  coincide.  For  linear  codes,  the  minimal  trellis  is  sometimes 
called  the  BCJR  trellis,  after  the  authors  of  [5]  who  first  came  up  with  the  construction 
of  such  a  trellis.  We  elaborate  upon  the  BCJR  construction  in  Section  5. 


Figure  3.  Minimal  trellis  for  the  code  C  =  {00000, 11010,01101, 10111} 

There are several natural measures of complexity for a given trellis, including the state
complexity $s = \max_i \log_2 |V_i|$, the edge complexity $|E|$, and the Viterbi decoding complexity
$D = 2|E| - |V| + 1$. Recent work has clarified the relationship between these
parameters,  and  to  a  large  extent  they  can  be  considered  as  equivalent,  at  least  as  the 
block  length  n  gets  large.  The  minimal  trellis  uniquely  minimizes  all  of  these  complexity 
measures,  given  a  fixed  time  axis  for  the  code.  The  precise  statement  and  proof  of  this 
and  other  related  facts  is  the  subject  of  a  number  of  recent  papers  [33,  62,  73,  76]. 

As  a  simple  example,  consider  the  (5, 2, 3)  linear  code  C  =  {00000, 11010, 01101, 10111}. 
The  minimal  trellis  for  this  code  is  shown  in  Figure  3.  The  complexity  measures  for  this 
trellis, namely $s = 2$, $|E| = 16$, and $D = 19$, can be easily found by inspection.
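
These values can also be recomputed without drawing the trellis. The sketch below relies on the characterization from [63] that the vertices at time $i$ of the minimal proper trellis correspond to classes of length-$i$ codeword prefixes having the same set of continuations; the function names and the string representation of codewords are our assumptions.

```python
from math import log2

def future_classes(code, i):
    """Distinct sets of continuations over all length-i codeword prefixes."""
    table = {}
    for c in code:
        table.setdefault(c[:i], set()).add(c[i:])
    return {frozenset(s) for s in table.values()}

def trellis_measures(code):
    """Vertex profile, |E|, s, and D of the minimal proper trellis."""
    n = len(next(iter(code)))
    classes = [future_classes(code, i) for i in range(n + 1)]
    vertex_counts = [len(cl) for cl in classes]
    # In a proper trellis a vertex has one outgoing edge per distinct
    # first symbol among its continuations.
    edges = sum(len({x[0] for x in fut}) for cl in classes[:n] for fut in cl)
    s = max(log2(v) for v in vertex_counts)
    d = 2 * edges - sum(vertex_counts) + 1
    return vertex_counts, edges, s, d

C = {"00000", "11010", "01101", "10111"}
print(trellis_measures(C))
```

For this code the vertex profile is 1, 2, 4, 4, 2, 1, which gives $|V| = 14$ and hence $D = 2 \cdot 16 - 14 + 1 = 19$.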

4. The main result: OBDDs and minimal trellises


In  this  section,  we  rigorously  establish  the  connection  between  minimal  proper  trellises 
and  ordered  binary  decision  diagrams.  In  doing  so,  we  will  make  frequent  use  of  concepts, 
constructions,  and  theorems  discussed  in  the  foregoing  two  sections. 


First, we observe that there is a natural one-to-one correspondence between Boolean
functions of $n$ variables and binary codes of length $n$. Let $C$ be such a code, not necessarily
linear or rectangular. We define the Boolean function $f_C(x_1, \ldots, x_n)$ as follows:

$$f_C(x_1, \ldots, x_n) \stackrel{\mathrm{def}}{=} \begin{cases} 0 & \text{if } (x_1, \ldots, x_n) \in C \\ 1 & \text{otherwise} \end{cases}$$

We call $f_C(x_1, \ldots, x_n)$ the indicator function of $C$. To make the terminology concise, we
will often refer to a binary decision diagram for the indicator function of $C$ simply as
a BDD for $C$. Conversely, given a Boolean function $f(x_1, \ldots, x_n)$, we define the binary
block code $C_f$ of length $n$ as the set of all truth assignments to $x_1, \ldots, x_n$ such that
$f(x_1, \ldots, x_n) = 0$. Thus $C_f$ is just the off-set of $f$, and $f$ is the indicator function of $C_f$.


Next, we define the single-terminal OBDD for a Boolean function $f(x_1, \ldots, x_n)$ by the
following  procedure,  analogous  to  Construction  A.  In  fact,  this  procedure  is  exactly  the 
same  as  Construction  A,  except  for  one  extra  step,  as  summarized  below. 


Construction  B 


Input: Boolean function $f(x_1, \ldots, x_n)$ and variable ordering $x_1 \prec \cdots \prec x_n$.
Output: Single-terminal ordered binary decision diagram for $f(x_1, \ldots, x_n)$.

Algorithm: Starting with the full binary decision tree for $f(x_1, \ldots, x_n)$, proceed
as follows:

Step  1.  Merge  duplicate  terminals. 

Step  X.  Prune  away  the  1-terminal. 

Step  2.  Merge  all  duplicate  nonterminals. 

Step  3.  Remove  all  redundant  tests. 

Iterate  steps  2  and  3  until  no  duplicate  nonterminals  or  redundant  tests  remain. 


Recall that after merging the duplicate terminals in step 1, we have a directed graph with
exactly two terminal vertices, labeled 0 and 1. We then recursively remove all the edges
and vertices leading only to the terminal labeled 1. This is the step of pruning the 1-terminal
in Construction B. Each nonterminal vertex in the resulting graph has either one
or two children. If a given vertex $v$ has only one child, we set $\hookrightarrow_0(v) = \emptyset$ or $\hookrightarrow_1(v) = \emptyset$,
by convention. With this convention, the definitions of redundant tests and duplicate
nonterminals remain as before, and the algorithm then continues as in Construction A.

The resulting decision diagram has a single terminal vertex, corresponding to all the
sequences that evaluate to 0 by $f(x_1, \ldots, x_n)$, or equivalently all of the codewords of $C_f$.
It is important to note that since $f(x_1, \ldots, x_n)$ is binary, this does not discard any information,
and the complete OBDD can be reconstructed from the single-terminal OBDD.
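
Construction B can be sketched in the same style as Construction A. The code below works directly from the set of codewords, so the pruning of step X amounts to returning `None` (our stand-in for the empty child $\emptyset$) for any prefix that no codeword extends. All names, and the string and integer encodings, are assumptions made for illustration.

```python
from collections import Counter

TERMINAL = "0"   # the single surviving terminal

def single_terminal_obdd(code):
    """Sketch of Construction B for the indicator function of `code`.

    Returns (root, nodes); `nodes` maps an integer id to a triple
    (variable index, 0-child, 1-child), where a pruned child is None.
    """
    n = len(next(iter(code)))
    prefixes = {c[:i] for c in code for i in range(n + 1)}
    unique = {}

    def expand(prefix):
        if prefix not in prefixes:     # leads only to the 1-terminal: pruned
            return None
        if len(prefix) == n:           # a codeword: reaches the 0-terminal
            return TERMINAL
        lo, hi = expand(prefix + "0"), expand(prefix + "1")
        if lo == hi:                   # redundant test
            return lo
        key = (len(prefix) + 1, lo, hi)
        return unique.setdefault(key, len(unique))   # merge duplicates

    root = expand("")
    nodes = {ident: key for key, ident in unique.items()}
    return root, nodes

def level_profile(nodes):
    """Number of nonterminals per tested variable, in ascending order."""
    counts = Counter(var for var, _, _ in nodes.values())
    return [counts[v] for v in sorted(counts)]

C = {"00000", "11010", "01101", "10111"}
root, nodes = single_terminal_obdd(C)
print(level_profile(nodes))
```

For a code with minimum distance 1, such as {00, 01}, the redundant-test rule does fire and the diagram skips a variable, so the vertices no longer sit at well-defined depths.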

This observation shows that the single-terminal OBDD can also be obtained in a slightly
different manner. Namely, the operation of pruning away the 1-terminal (step X) can
be carried out after the full OBDD for $f(x_1, \ldots, x_n)$ is constructed. We will refer to this
variation as Construction C. Indeed, it is not difficult to show that the graphs $\mathcal{D}_B$ and $\mathcal{D}_C$
produced by Constructions B and C, respectively, are isomorphic. Each nonterminal
vertex in these graphs has out-degree one or two. In every instance where the out-degree
is one, the missing edge must correspond to a sequence that belongs to the on-set of
$f(x_1, \ldots, x_n)$. Hence, by first appending a terminal labeled 1, and then adding an edge
from each unary vertex to this 1-terminal, labeling this edge so that the resulting graph
is proper, we obtain a complete OBDD for $f(x_1, \ldots, x_n)$ from both $\mathcal{D}_B$ and $\mathcal{D}_C$. However,
two complete OBDDs for the same function are isomorphic, and hence so are $\mathcal{D}_B$ and $\mathcal{D}_C$.


Figure  4.  The  OBDD  and  the  single-terminal  OBDD  for  the 
function  f  in  (2),  or  equivalently  for  the  code  C  defined  by  (3) 


As  a  simple  example  for  Construction  B  (or  for  Construction  C),  consider  again  the 
Boolean  function  that  was  used  in  Section  2  to  illustrate  Construction  A,  namely: 

$$f(x_1, x_2, x_3, x_4, x_5) = (x_1 \oplus x_2 \oplus x_3) + (x_1 \oplus x_4) + (x_1 \oplus x_2 \oplus x_5) \qquad (2)$$

Notice that $f(x_1, x_2, x_3, x_4, x_5)$ is also the indicator function of the (5, 2, 3) linear code
$C = \{00000, 11010, 01101, 10111\}$ used as an example in Section 3. This becomes immediately
clear upon observing that a parity-check matrix for $C$ is given by


$$H = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 \end{bmatrix} \qquad (3)$$


The  OBDD  and  the  single-terminal  OBDD  for  C  are  shown  in  Figure  4.  Notice  that  the 
single-terminal  OBDD  is  the  same  as  the  minimal  proper  trellis  for  C,  shown  in  Figure  3. 
Our  main  result  is  the  following  theorem,  proving  that  this  must  always  be  the  case. 

Theorem 1. Let $C$ be an arbitrary binary code with minimum distance $d > 1$. Then
the  single-terminal  OBDD  for  C  is  the  unique  minimal  proper  trellis  for  C. 

Proof.  It  is  easy  to  see  that  the  graph  resulting  after  steps  1  and  X  in  Construction  B 
is  a  trellis  for  C.  By  the  c->o(u)  =  and  M-i(u)  =  c-»1(u)  property  of  duplicate  non¬ 

terminals,  the  merging  procedure  in  step  2  does  not  create  any  new  paths  that  are  not 
codewords.  Furthermore,  by  the  var(v)  =  var(w)  property,  this  procedure  also  preserves 
depth.  Hence,  the  graph  resulting  after  step  2  is  still  a  trellis  for  C.  Now,  since  d  >  1, 
there  can  be  no  redundant  tests  in  any  trellis  for  C.  Thus  step  3  in  Construction  B  is 
vacuous,  and  the  single-terminal  OBDD  is  a  trellis  for  C.  Furthermore,  it  is  obvious 
that  the  outgoing  edges  of  every  vertex  in  any  binary  decision  diagram  must  be  labeled 
distinctly.  Hence  the  single-terminal  OBDD  for  C  is  a  proper  trellis  for  C. 

It remains to show that the single-terminal OBDD is the minimal proper trellis for $C$.
To this end, we need to introduce some more notation and results from trellis theory.
For $i = 1, 2, \ldots, n-1$, we define the projection of $C$ on the past at time $i$ as follows:

$$\mathcal{P}_i(C) \stackrel{\rm def}{=} \bigl\{ (c_1,c_2,\ldots,c_i) : (c_1,\ldots,c_i,c_{i+1},\ldots,c_n) \in C \text{ for some } c_{i+1},\ldots,c_n \in \mathbb{F}_2 \bigr\}$$

For each $c \in \mathcal{P}_i(C)$, we define the future of $c$ as $\mathcal{F}(c) = \{x \in \mathbb{F}_2^{n-i} : (c,x) \in C\}$, and say
that $c_1, c_2 \in \mathcal{P}_i(C)$ are future-equivalent if $\mathcal{F}(c_1) = \mathcal{F}(c_2)$. It is shown in [63] that a proper
trellis $T = (V, E, \mathbb{F}_2)$ for $C$ is minimal if and only if for all $i = 1, 2, \ldots, n-1$, the number
of vertices at time $i$ in $T$ is equal to the number of future-equivalence classes defined by
this relation. From this, we can derive an alternative necessary and sufficient condition
for minimality as follows. Given a vertex $v \in V_i$, we define:

$$\mathcal{F}_T(v) \stackrel{\rm def}{=} \bigl\{ x \in \mathbb{F}_2^{n-i} : x \text{ is a sequence of edge labels along a path in } T \text{ starting at } v \bigr\}$$

Then a proper trellis $T$ is minimal if and only if for all $i = 1, 2, \ldots, n-1$ and for every
pair of distinct vertices $v, v' \in V_i$, we have $\mathcal{F}_T(v) \neq \mathcal{F}_T(v')$. Indeed, this condition implies that
$c_1, c_2 \in \mathcal{P}_i(C)$ are future-equivalent if and only if the paths corresponding to $c_1$ and $c_2$ end at the
same vertex of $V_i$. Thus $|V_i|$ must be equal to the number of future-equivalence classes.
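The counting criterion above is easy to evaluate for a small code. The following minimal sketch represents codewords as bit strings and counts the future-equivalence classes at every time index; the names `future` and `vertex_profile` are ours, and the example is the (5,2,3) code used earlier.

```python
def future(code, prefix):
    """The set of suffixes x such that (prefix, x) is a codeword."""
    i = len(prefix)
    return frozenset(c[i:] for c in code if c[:i] == prefix)

def vertex_profile(code):
    """Number of future-equivalence classes at times 0, 1, ..., n."""
    n = len(next(iter(code)))
    profile = []
    for i in range(n + 1):
        prefixes = {c[:i] for c in code}              # the projection on the past
        classes = {future(code, p) for p in prefixes}  # distinct futures
        profile.append(len(classes))
    return profile

C = {"00000", "11010", "01101", "10111"}
profile = vertex_profile(C)
```

By the criterion of [63] quoted above, `profile` lists the number of vertices at each time in a minimal proper trellis for this code.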

Now consider the single-terminal OBDD for $C$. We already know that this is a proper
trellis for $C$. Call this trellis $T = (V, E, \mathbb{F}_2)$, and assume to the contrary that there
exist two distinct vertices $v, v' \in V_i$ with $\mathcal{F}_T(v) = \mathcal{F}_T(v')$. By Construction B, at least
one of $\{\hookrightarrow_0(v), \hookrightarrow_0(v')\}$ or $\{\hookrightarrow_1(v), \hookrightarrow_1(v')\}$ must be a pair of distinct vertices, otherwise
$v$ and $v'$ would have been merged as duplicate nonterminals. Notice that we allow for the
possibility that some of $\hookrightarrow_0(v)$, $\hookrightarrow_1(v)$, $\hookrightarrow_0(v')$, $\hookrightarrow_1(v')$ may be $\varnothing$, which means that they
are not present in the single-terminal OBDD. However, if one of $\hookrightarrow_0(v)$, $\hookrightarrow_0(v')$ is $\varnothing$
then so is the other one, since otherwise $\mathcal{F}_T(v) \neq \mathcal{F}_T(v')$. By a similar argument, either
$\hookrightarrow_1(v) = \hookrightarrow_1(v') = \varnothing$ or both are present in the OBDD. Thus we may assume, w.l.o.g.,
that $u = \hookrightarrow_0(v)$ and $u' = \hookrightarrow_0(v')$ are both present in the single-terminal OBDD, and
$u \neq u'$. But then $\mathcal{F}_T(v) = \mathcal{F}_T(v')$ implies that $\mathcal{F}_T(u) = \mathcal{F}_T(u')$. Thus, from the existence
of distinct vertices $v, v' \in V_i$ with $\mathcal{F}_T(v) = \mathcal{F}_T(v')$ we have deduced the existence of
distinct vertices $u, u' \in V_{i+1}$ with $\mathcal{F}_T(u) = \mathcal{F}_T(u')$. Iterating this argument, we arrive at
a contradiction, since $V_n$ consists of a single vertex by construction. ∎

As an immediate corollary to Theorem 1 and the fact that the minimal proper trellis is
actually minimal for rectangular codes, we conclude that if $C$ is rectangular and $d > 1$
then the single-terminal OBDD for $C$ is isomorphic to the unique minimal trellis for $C$.
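The merging argument in the proof of Theorem 1 can be illustrated with a short program. The sketch below is not Construction B itself, which is stated earlier in the paper; it is our own approximation that builds the tree of codewords, merges all leaves into one terminal, and then merges duplicate nonterminals bottom-up through a unique table. The function name and the use of `None` for an absent successor are assumptions made for illustration.

```python
def single_terminal_obdd_levels(code):
    """Vertex counts per level after merging duplicate nonterminals bottom-up."""
    n = len(next(iter(code)))
    unique = {}  # unique table: (level, 0-successor, 1-successor) -> vertex id

    def make(level, lo, hi):
        # Vertices with the same level and the same successors share one id.
        return unique.setdefault((level, lo, hi), len(unique))

    # Every codeword ends at the single terminal on level n.
    node = {c: make(n, None, None) for c in code}
    for i in range(n - 1, -1, -1):
        for p in {c[:i] for c in code}:
            # A missing successor (None) stands for an edge that is not present.
            node[p] = make(i, node.get(p + "0"), node.get(p + "1"))

    counts = [0] * (n + 1)
    for level, _, _ in unique:
        counts[level] += 1
    return counts

C = {"00000", "11010", "01101", "10111"}
levels = single_terminal_obdd_levels(C)
```

For this code, which has $d = 3 > 1$, the level sizes equal the numbers of future-equivalence classes of $C$, as Theorem 1 predicts.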

We point out that an alternative way to view these results comes from considering a binary
code $C$ or a Boolean function $f_C$ as defining a regular set in $\mathbb{F}_2^n$. As such, the
Myhill-Nerode theorem [42] guarantees the uniqueness of the minimal deterministic finite-state
automaton  (DFA)  accepting  this  set.  It  follows  that  when  the  distance  of  C  is  larger 
than  one,  the  state  diagram  of  its  DFA  is  the  same  as  the  minimal  proper  trellis,  or  the 
single-terminal  OBDD.  This  viewpoint  is  briefly  mentioned  in  the  multilingual  dictionary 
of  coding,  systems  theory,  symbolic  dynamics,  and  automata  theory  [35]. 


5.  Directions  for  transfer  of  ideas 

The  connection  between  binary  decision  diagrams  and  trellises  established  in  the  previous 
section  makes  it  possible  to  translate  knowledge  accumulated  in  one  discipline  into  the 
language  of  the  other.  We  will  give  just  a  few  examples  of  this  in  what  follows.  In  light 
of  the  extensive  work  that  has  been  done  in  each  of  these  areas,  many  other  possibilities 
for  transfer  of  results  and  ideas  between  the  two  disciplines  surely  exist. 


5.1.  From  trellises  to  binary  decision  diagrams 

We  use  results  from  trellis  theory  to  analyze  a  certain  structural  property  of  binary 
decision  diagrams,  provide  lower  bounds  on  the  size  of  OBDDs,  and  derive  a  new  type 
of  decision  diagrams  that  are  often  more  compact  than  OBDDs.  We  also  comment  on 
the  complexity  of  the  variable  ordering  problem,  and  on  alternative  graphical  models  for 
Boolean  functions  that  may  follow  from  the  recent  research  in  coding  theory. 

Biproper binary decision diagrams. Let $\mathcal{D}$ be an ordered binary decision diagram
for a Boolean function $f(x_1,\ldots,x_n)$. It is obvious that the outgoing edges of every
nonterminal vertex in $\mathcal{D}$ must be labeled distinctly. When are the incoming edges
of every nonterminal vertex in $\mathcal{D}$ also labeled distinctly? The following proposition,
which follows directly from Theorem 1, provides an answer to this question.

Proposition 2. Let $f(x_1,\ldots,x_n)$ be a Boolean function, and let $x_1 \prec \cdots \prec x_n$ be an
ordering of its variables. If the corresponding binary code $C_f$ is rectangular then the incoming
edges of every nonterminal vertex in the OBDD for $f(x_1,\ldots,x_n)$ are labeled distinctly.

Proof. It is known [69, 51, 75, 76] that the minimal proper trellis for a rectangular
code is biproper. Thus if $C_f$ is a rectangular code with minimum distance $d(C_f) > 1$,
then the single-terminal OBDD for $f(x_1,\ldots,x_n)$ is isomorphic to the minimal biproper
trellis for $C_f$, and hence all the incoming edges in the single-terminal OBDD are labeled
distinctly. Since every nonterminal vertex in the complete OBDD for $f(x_1,\ldots,x_n)$ is
also a vertex in the single-terminal OBDD, the proposition follows.

Now assume that $C_f$ is rectangular and $d(C_f) = 1$. Then the graph resulting after step 2
of Construction B is still a biproper trellis for $C_f$. It remains to observe that removing
redundant tests in a biproper trellis does not create duplicate nonterminals, and that
the resulting graph remains biproper. ∎

Borrowing  the  terminology  of  trellis  theory,  we  will  say  that  a  binary  decision  diagram  in 
which  the  incoming  edges  of  every  nonterminal  vertex  are  labeled  distinctly  is  biproper. 
A  biproper  single-terminal  OBDD  has  the  curious  property  that  it  can  be  used  to  evaluate 
the  function  in  two  different  ways:  either  traversing  from  top  to  bottom  —  as  is  the 
standard  practice  —  or  traversing  from  bottom  to  top.  In  other  words,  the  root  and 
the  single-terminal  are  interchangeable  in  a  biproper  single-terminal  OBDD.  This,  in 
particular, implies that the variable orderings $x_1 \prec \cdots \prec x_n$ and $x_n \prec \cdots \prec x_1$ produce
isomorphic  decision  diagrams  in  this  case. 
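Biproperness can be tested directly on the minimal proper trellis of a small code. The sketch below identifies each vertex with a pair (time, future set), which is our own modeling choice, and checks that no vertex has two incoming edges with the same label. The second example, the code {00, 01, 10}, is our own illustration of a non-rectangular code.

```python
def future(code, prefix):
    """The set of suffixes x such that (prefix, x) is a codeword."""
    i = len(prefix)
    return frozenset(c[i:] for c in code if c[:i] == prefix)

def trellis_edges(code):
    """Edges (time, tail, label, head); vertices are future-equivalence classes."""
    n = len(next(iter(code)))
    edges = set()
    for i in range(1, n + 1):
        for p in {c[:i] for c in code}:
            edges.add((i, future(code, p[:-1]), p[-1], future(code, p)))
    return edges

def is_biproper(code):
    """True if outgoing labels and incoming labels are distinct at every vertex."""
    edges = trellis_edges(code)
    outgoing = [(i, tail, a) for i, tail, a, _ in edges]
    incoming = [(i, head, a) for i, _, a, head in edges]
    return len(set(outgoing)) == len(outgoing) and len(set(incoming)) == len(incoming)

linear_code = {"00000", "11010", "01101", "10111"}   # rectangular, since linear
non_rectangular = {"00", "01", "10"}
```

In the second code the prefixes 0 and 1 both reach the final vertex through an edge labeled 0, so the check fails there.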


Figure 5. A biproper OBDD for $x_1\bar{x}_3 + \bar{x}_1(x_2x_3 + \bar{x}_2\bar{x}_3)$

Notice that whether the OBDD for $f(x_1,\ldots,x_n)$ is biproper depends not only on the
function $f(x_1,\ldots,x_n)$ itself, but also on the ordering of its variables. Indeed, there exist
codes [69] that are rectangular for some orderings of the time axis and non-rectangular
for other orderings. Finally, we observe that the sufficient condition for biproperness
given in Proposition 2 is “almost” necessary as well. It is known [51] that a code is
rectangular if and only if it admits a biproper trellis representation. Thus a Boolean
function $f(x_1,\ldots,x_n)$ whose off-set $C_f$ has distance $d(C_f) > 1$ can be represented by
a biproper OBDD if and only if $C_f$ is rectangular. However, if $d(C_f) = 1$, this is no
longer true in general. As an example, consider the Boolean function:

$$f(x_1,x_2,x_3) = x_1\bar{x}_3 + \bar{x}_1(x_2x_3 + \bar{x}_2\bar{x}_3)$$

whose off-set is given by $C_f = \{001, 010, 101, 111\}$. This function has a biproper OBDD,
shown in Figure 5, even though $C_f$ is not rectangular under any ordering.
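Both claims about this example can be confirmed by exhaustive search. The sketch below assumes the function is read as $x_1\bar{x}_3 + \bar{x}_1(x_2x_3 + \bar{x}_2\bar{x}_3)$, which is the reading consistent with the stated off-set, and tests rectangularity through the equivalent condition that any two futures at a given time are either equal or disjoint. All names are ours.

```python
from itertools import permutations, product

def f(x1, x2, x3):
    """Assumed reading: f = x1*~x3 + ~x1*(x2*x3 + ~x2*~x3)."""
    n1, n2, n3 = 1 - x1, 1 - x2, 1 - x3
    return (x1 & n3) | (n1 & ((x2 & x3) | (n2 & n3)))

off_set = {"".join(map(str, x)) for x in product((0, 1), repeat=3) if f(*x) == 0}

def is_rectangular(code):
    """Rectangular iff, at every time, distinct futures are pairwise disjoint."""
    n = len(next(iter(code)))
    for i in range(1, n):
        futures = list({frozenset(c[i:] for c in code if c[:i] == p)
                        for p in {c[:i] for c in code}})
        for k, a in enumerate(futures):
            if any(a & b for b in futures[k + 1:]):
                return False
    return True

def permute(code, order):
    """Reorder the coordinates of every codeword."""
    return {"".join(c[j] for j in order) for c in code}

rectangular_somewhere = any(is_rectangular(permute(off_set, p))
                            for p in permutations(range(3)))
```

The search covers all six orderings of the three variables.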

Lower bounds on the size of binary decision diagrams. Much work in coding
theory  [33,  47,  53,  54,  63,  65]  has  been  devoted  to  lower  bounds  on  the  size  of  the 
minimal  trellis  for  a  given  code,  under  all  possible  permutations  of  the  time  axis.  Here, 
we  translate  some  of  these  bounds  into  the  language  of  binary  decision  diagrams. 

To this end, we first need to introduce the appropriate notation. Given a Boolean
function $f(x_1,\ldots,x_n)$, we let $\Theta_0(f)$ and $\Theta_1(f)$ denote the cardinalities of the off-set and the
on-set of $f$, respectively. Thus $\Theta_0(f)$ is just the number of codewords in $C_f$. Next, we
elaborate the notation for cofactors of $f(x_1,\ldots,x_n)$ that was introduced in Section 2.2.
Given a fixed string $(a_1,\ldots,a_m) \in \{0,1\}^m$ and a subset $\{i_1,i_2,\ldots,i_m\} \subseteq \{1,2,\ldots,n\}$,
we let $f|_{x_{i_1},\ldots,x_{i_m}=a_1,\ldots,a_m}$ denote the function obtained from $f(x_1,\ldots,x_n)$ by replacing
the variable $x_{i_1}$ by the value $a_1$, the variable $x_{i_2}$ by the value $a_2$, and so forth.

For each subset $J = \{i_1,i_2,\ldots,i_m\} \subseteq \{1,2,\ldots,n\}$, we can now define a discrete random
variable $X_J$ as follows: $X_J$ takes on values in $\{0,1\}^m$ with probabilities given by:

$$\Pr\{X_J = (a_1,\ldots,a_m)\} = \frac{\Theta_0\bigl(f|_{x_{i_1},\ldots,x_{i_m}=a_1,\ldots,a_m}\bigr)}{\Theta_0(f)} \qquad (4)$$

Notice that for some values of $a_1,\ldots,a_m$, the function $f|_{x_{i_1},\ldots,x_{i_m}=a_1,\ldots,a_m}$ may be a
tautology, in which case $\Pr\{X_J = (a_1,\ldots,a_m)\} = 0$. Thus the number of different values
that $X_J$ takes on may be less than $2^m$.

We next recall the definition of entropy. If $X$ is a discrete random variable taking $M$
values with nonzero probabilities $p_1, p_2, \ldots, p_M$, the entropy of $X$ is given by:

$$H(X) = p_1\log\frac{1}{p_1} + p_2\log\frac{1}{p_2} + \cdots + p_M\log\frac{1}{p_M}$$

In terms of the notation introduced in the foregoing paragraphs, we are finally ready to
define the entropy profile of a Boolean function.

Definition 4. Let $f(x_1,\ldots,x_n)$ be a Boolean function of $n$ variables. We define $\eta_i(f)$ as
the minimum possible entropy of a set of $i$ function variables, namely:

$$\eta_i(f) \stackrel{\rm def}{=} \min H(X_J) \quad \text{for } i = 1, 2, \ldots, n$$

where the minimum is taken over all subsets $J \subseteq \{1,2,\ldots,n\}$ with $|J| = i$. The
sequence $\eta_1(f), \eta_2(f), \ldots, \eta_n(f)$ will be called the entropy profile of $f(x_1,\ldots,x_n)$.
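For a small code the entropy profile can be computed by brute force. The sketch below assumes logarithms to base 2 and uses the fact that the probability in (4) is the fraction of codewords that agree with $(a_1,\ldots,a_m)$ on the coordinates in $J$; the function names are ours, and the example is the (5,2,3) code of (3).

```python
from collections import Counter
from itertools import combinations
from math import log2

def entropy_of_subset(code, J):
    """H(X_J) when a codeword is drawn uniformly and restricted to coordinates J."""
    counts = Counter(tuple(c[j] for j in J) for c in code)
    total = len(code)
    return sum((k / total) * log2(total / k) for k in counts.values())

def entropy_profile(code):
    """eta_1, ..., eta_n: minimum of H(X_J) over all subsets J of each size."""
    n = len(next(iter(code)))
    return [min(entropy_of_subset(code, J) for J in combinations(range(n), i))
            for i in range(1, n + 1)]

C = {"00000", "11010", "01101", "10111"}
profile = entropy_profile(C)
```

The search over all subsets is exponential in $n$, so this is only meant for toy examples.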

A powerful lower bound on the size of the OBDD for a Boolean function $f(x_1,\ldots,x_n)$
can be derived from its entropy profile, provided $d(C_f) > 1$. Notably, this bound limits
the size of the smallest OBDD that can be obtained under all possible orderings of the
variables $x_1,\ldots,x_n$. The bound was proved by Reuven and Be’ery [65] in the context of
trellises; it constitutes a culmination of a long line of work in trellis theory [63, 33, 54].
All we have done here is recast this result in the framework of binary decision diagrams,
using the correspondence between BDDs and trellises established in Theorem 1.

Theorem 3. Let $f(x_1,\ldots,x_n)$ be a Boolean function such that $d(C_f) > 1$. Then the
number of vertices at level $i$ in the OBDD for $f(x_1,\ldots,x_n)$ is lower bounded by

$$\frac{2^{\eta_i(f)} \cdot 2^{\eta_{n-i}(f)}}{\Theta_0(f)} \qquad (5)$$

where $\eta_1(f), \eta_2(f), \ldots, \eta_n(f)$ is the entropy profile of $f$. This holds for all $i = 1, 2, \ldots, n$
and for any total order on the support $\{x_1,\ldots,x_n\}$.
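As a small worked example, consider the (5,2,3) code $C$ of (3). We assume here that the bound of Theorem 3 reads $2^{\eta_i(f)} \cdot 2^{\eta_{n-i}(f)} / \Theta_0(f)$, with the convention $\eta_0(f) = 0$; both are our reading and should be checked against [65]. Computing by hand with base-2 logarithms gives $\Theta_0(f) = 4$ and the entropy profile $(1, 1, 2, 2, 2)$, so that

$$i = 1:\ \frac{2^{1} \cdot 2^{2}}{4} = 2, \qquad i = 2:\ \frac{2^{1} \cdot 2^{2}}{4} = 2, \qquad i = 3:\ \frac{2^{2} \cdot 2^{1}}{4} = 2, \qquad i = 4:\ \frac{2^{2} \cdot 2^{1}}{4} = 2, \qquad i = 5:\ \frac{2^{2} \cdot 2^{0}}{4} = 1$$

Under the natural ordering, the numbers of future-equivalence classes of $C$ at times $1, \ldots, 5$ are $2, 4, 4, 2, 1$, so the bound is met with equality at levels 1, 4, and 5 and is off by a factor of two at levels 2 and 3.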

We believe that it should be possible to extend the scope of Theorem 3 to functions
that do not satisfy the requirement $d(C_f) > 1$. One such extension is immediate. It is
obvious, by symmetry, that the same result holds if we look at the on-set of the function
rather than at the off-set, and replace $\Theta_0(\cdot)$ by $\Theta_1(\cdot)$ in equations (4) and (5). Thus
to apply Theorem 3, it would suffice to require that either $C_f$ or its complement in $\mathbb{F}_2^n$
have minimum distance greater than one. Another possible extension might follow by
observing that this requirement essentially ensures that no redundant tests are encountered
in the construction of the OBDD. If $C_f$ is a rectangular code with $d(C_f) = 1$, then
removing redundant tests does not create duplicate nonterminals (as noted in the proof
of Proposition 2). Thus, in most cases, this step will not reduce the size of the graph
significantly. Exploring to what extent the removal of redundant tests can reduce the size of
the OBDD beyond the bound of (5) would be an interesting problem for future research.


Sectionalized  decision  diagrams.  Variable  orderings  for  binary  decision  diagrams 
correspond  to  permutations  of  the  time  axis  for  binary  codes.  Indeed,  the  problem  of 
finding the best variable ordering for a given function, or equivalently the best permutation
of the time axis for a given code, is key in both areas. In trellis theory, another
operation  on  the  time  axis,  called  sectionalization,  has  been  found  useful  in  a  variety  of 
contexts.  To  the  best  of  our  knowledge,  the  counterpart  of  this  operation  for  binary 
decision  diagrams  has  not  been  investigated  previously  in  the  BDD  literature. 

In  trellis  theory,  a  sectionalization  corresponds  to  a  choice  of  the  symbol  alphabet  at  each 
time index. For example, a binary code of length $2n$ may be thought of as a quaternary
code  of  length  n  if  pairs  of  consecutive  bits  are  grouped  together.  A  wide  variety  of  such 
granularity  adjustments  [36]  is  possible,  and  each  may  substantially  affect  the  number 
of  vertices,  the  number  of  edges,  and  the  overall  structure  of  the  trellis. 

The  analogous  operation  for  binary  decision  diagrams  consists  of  grouping  consecutive 
variables  together,  and  taking  non-binary  decisions  at  each  level,  based  on  the  value  of 
all  the  variables  that  correspond  to  this  level.  Let  us  illustrate  this  idea  by  an  example. 
Consider  the  following  Boolean  function: 

$$f(x_1,\ldots,x_8) = (x_1 \oplus x_2 \oplus x_3 \oplus x_4) + (x_3 \oplus x_4 \oplus x_5 \oplus x_6) + (x_5 \oplus x_6 \oplus x_7 \oplus x_8) + (x_2 \oplus x_3 \oplus x_6 \oplus x_7) \qquad (6)$$

The conventional single-terminal OBDD for this function corresponds to grouping its
variables into singletons $\{x_1\}, \ldots, \{x_8\}$. This decision diagram is shown in Figure 6a.
Instead, suppose that we group the variables into pairs $\{x_1,x_2\}$, $\{x_3,x_4\}$, $\{x_5,x_6\}$, $\{x_7,x_8\}$
and take four-way decisions at each of the resulting four levels, depending upon whether
the value of the variables in the corresponding pair is 00, 01, 10, or 11. The resulting
single-terminal decision diagram is shown in Figure 6b. It is easy to see that this
diagram is substantially more compact than the conventional OBDD, although we have
not changed the order of the variables (in fact, this order is known [75] to be optimal).
Also notice that a complete decision diagram for the function $f(x_1,\ldots,x_8)$ in (6) can be
recovered from Figure 6b by adding 28 more edges, in such a way that the out-degree of
each nonterminal vertex becomes 4, and directing all these edges to the 1-terminal.



Figure  6.  Two  decision  diagrams  for  the  Boolean  function  in  (6) 

The  edge  labels  in  Figure  6b  correspond  to  the  values  of  the  decision 
variables  that  result  in  the  traversal  of  the  edge. 
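The effect of grouping the variables of (6) into pairs can be reproduced numerically. The sketch below assumes that (6) is the OR of the four parity checks shown, takes the off-set as the code, and counts vertices at each section boundary together with the distinct labeled edges in each section. The function `sectionalized_counts` and the representation of vertices as future sets are our own choices.

```python
from itertools import product

def f(x):
    """Equation (6): the OR of four parity checks on eight variables."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    return ((x1 ^ x2 ^ x3 ^ x4) | (x3 ^ x4 ^ x5 ^ x6) |
            (x5 ^ x6 ^ x7 ^ x8) | (x2 ^ x3 ^ x6 ^ x7))

CODE = {"".join(map(str, x)) for x in product((0, 1), repeat=8) if f(x) == 0}

def future(code, prefix):
    """The set of suffixes x such that (prefix, x) is a codeword."""
    i = len(prefix)
    return frozenset(c[i:] for c in code if c[:i] == prefix)

def sectionalized_counts(code, boundaries):
    """Vertices at each boundary and distinct labeled edges in each section."""
    vertices = [len({future(code, c[:b]) for c in code}) for b in boundaries]
    edges = [len({(future(code, c[:a]), c[a:b], future(code, c[:b])) for c in code})
             for a, b in zip(boundaries, boundaries[1:])]
    return vertices, edges

bits_v, bits_e = sectionalized_counts(CODE, list(range(9)))    # singletons
pairs_v, pairs_e = sectionalized_counts(CODE, [0, 2, 4, 6, 8])  # pairs
# Edges needed to bring every nonterminal vertex up to out-degree 4.
missing_edges = 4 * sum(pairs_v[:-1]) - sum(pairs_e)
```

The value of `missing_edges` corresponds to the number of edges that must be directed to the 1-terminal to complete the pair-sectionalized diagram.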

In general, there are many different ways to sectionalize a given BDD, that is, to parse
the variables $x_1,\ldots,x_n$ into groups: the number of distinct parsings, or sectionalizations,
of $x_1,\ldots,x_n$ is about $2^{n-1}$. The sectionalization problem thus consists of finding the
optimal parsing among the $2^{n-1}$ possibilities. In contrast to the variable ordering problem,
which is known to be NP-complete for OBDDs, it turns out that the sectionalization
problem has a polynomial-time solution. Lafourcade and Vardy [55] devised a
sectionalization algorithm, based on a dynamic programming approach, that finds the optimal
sectionalization of an arbitrary trellis in polynomial time. The algorithm of [55] works
for both linear and nonlinear codes, and easily accommodates a broad range of optimality
criteria. With some modifications, this algorithm can be applied to binary decision
diagrams. If a given single-terminal OBDD represents a function $f$ such that $d(C_f) > 1$,
then the algorithm of [55] works as is. Otherwise, one would need a slightly more
complicated book-keeping mechanism for the composition and amalgamation operations
defined in [55]. We leave the details of this modification for future work.
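To make the dynamic programming idea concrete, the following sketch finds a cheapest sectionalization when the total cost is a sum of per-section costs. It is not the algorithm of [55]: it is a generic shortest-path recursion over section boundaries, and the cost used here (twice the number of edges in a section minus the number of vertices at its right boundary, a simplified operation count) is a hypothetical criterion chosen only for illustration. The result is compared against exhaustive enumeration of all boundary choices.

```python
from itertools import combinations

C = {"00000", "11010", "01101", "10111"}
N = 5

def state(prefix):
    """Vertex reached by a prefix, identified with its future set."""
    return frozenset(c[len(prefix):] for c in C if c.startswith(prefix))

def section_cost(a, b):
    """Hypothetical additive cost of the section covering positions a..b-1."""
    edges = {(state(c[:a]), c[a:b], state(c[:b])) for c in C}
    heads = {state(c[:b]) for c in C}
    return 2 * len(edges) - len(heads)

def total_cost(boundaries):
    return sum(section_cost(a, b) for a, b in zip(boundaries, boundaries[1:]))

def all_sectionalizations(n):
    """Every boundary tuple 0 = b_0 < b_1 < ... < b_m = n."""
    for r in range(n):
        for cuts in combinations(range(1, n), r):
            yield (0, *cuts, n)

def best_sectionalization(n):
    """best[b] = (cheapest cost of covering positions 0..b-1, its boundaries)."""
    best = {0: (0, (0,))}
    for b in range(1, n + 1):
        best[b] = min((best[a][0] + section_cost(a, b), best[a][1] + (b,))
                      for a in range(b))
    return best[n]

best_cost, best_boundaries = best_sectionalization(N)
```

The recursion evaluates on the order of $n^2$ section costs, whereas the exhaustive search visits every one of the $2^{n-1}$ sectionalizations.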

For verification purposes, one of the most important properties of OBDDs is that they
are canonical: two functions $f(x_1,\ldots,x_n)$ and $g(x_1,\ldots,x_n)$ are equal if and only if their
(single-terminal) OBDDs are isomorphic for the same order on $x_1,\ldots,x_n$. Thus the
sectionalization operation would be less useful if it did not preserve canonicity. However,
the algorithm of Lafourcade and Vardy [55] can be easily refined in such a way that
canonicity is preserved under sectionalization. If we start with two isomorphic trellises
and sectionalize them using the algorithm of [55], with respect to the same optimality
criterion, then the resulting decision diagrams will be isomorphic. The converse is also true:
if two sectionalized decision diagrams are isomorphic, they represent the same function.


Complexity of the variable ordering problem. It is known [10, 11, 43, 45, 74]
that the variable ordering problem for binary decision diagrams and the permutation
problem for trellises are both computationally hard. However, the known NP-hardness
results establish the intractability of different aspects of these equivalent problems.

The primary intractability result in the OBDD literature is due to Bollig and
Wegener [11], who show that the following decision problem is NP-complete.

Instance: A Boolean function $f(x_1,\ldots,x_n)$ specified in terms of an ordered
binary decision diagram, and a positive size bound $s$.

Question: Is there an ordering of $x_1,\ldots,x_n$ such that the corresponding OBDD
for $f(x_1,\ldots,x_n)$ has at most $s$ vertices?

Notice that an implicit assumption in this result is that $f(x_1,\ldots,x_n)$ can be specified by
an OBDD whose size is polynomial in $n$. Indeed, if a function $f(x_1,\ldots,x_n)$ is specified
in terms of an OBDD with $N = \Theta(2^n)$ vertices, then the complexity of examining all $n!$
possible orderings of $x_1,\ldots,x_n$ is only $O(N^{\log\log N})$. Furthermore, the reduction used in
the proof of [11] explicitly constructs an OBDD whose size is polynomial in $n$. Thus the
hard instances of the foregoing problem are those functions that have a compact OBDD
representation. On the other hand, it is known (see [58, 77] and the discussion in the next
subsection) that the fraction of such functions becomes vanishingly small as $n \to \infty$.
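The quasi-polynomial count of orderings follows from a one-line estimate. Taking $N = 2^n$ exactly for simplicity (an assumption made only to keep the arithmetic clean), we have $n = \log_2 N$ and

$$n! \;\le\; n^n \;=\; 2^{\,n\log_2 n} \;=\; \bigl(2^n\bigr)^{\log_2 n} \;=\; N^{\log_2\log_2 N}$$

so the number of orderings is quasi-polynomial in the size of the input OBDD whenever that size is exponential in $n$.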

The  hardness  results  for  trellises  have  a  somewhat  different  flavor.  Specifically,  Horn  and 
Kschischang  [43]  prove  that  the  following  decision  problem  is  NP-complete. 

Instance:  A  binary  linear  code  C  of  length  n,  specified  by  its  parity-check  or 
generator  matrix,  a  positive  integer  i  <  n,  and  a  positive  size  bound  s. 

Question: Is there a permutation of the time axis, such that the number of
vertices at time $i$ in the corresponding minimal trellis for $C$ is less than $s$?

It is furthermore shown in [74] that this problem remains NP-complete if the size bound
is restricted to $s = 2^i$. When translated into the context of binary decision diagrams,
using Theorem 1, this implies the following result. Suppose we are given a positive
integer $i < n$ and a Boolean function $f(x_1,\ldots,x_n)$ specified in terms of a data structure,
other than OBDD, whose size is polynomial in $n$. Then deciding whether there exists
an ordering of $x_1,\ldots,x_n$ such that the corresponding OBDD for $f(x_1,\ldots,x_n)$ has fewer
than $2^i$ vertices at level $i$ is NP-complete.
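For very short codes the decision problem above can be settled by trying every permutation, which makes the statement concrete even though it is hopeless for large $n$. The sketch below is a brute-force illustration with names of our own choosing; vertices at time $i$ are counted as future-equivalence classes, which by Theorem 1 also gives the OBDD level size when $d > 1$.

```python
from itertools import permutations

def vertices_at_level(code, order, i):
    """Number of future-equivalence classes at time i after permuting coordinates."""
    permuted = {"".join(c[j] for j in order) for c in code}
    prefixes = {d[:i] for d in permuted}
    return len({frozenset(d[i:] for d in permuted if d[:i] == p) for p in prefixes})

def exists_ordering(code, i, s):
    """Is there a permutation with fewer than s vertices at time i? (n! trials)"""
    n = len(next(iter(code)))
    return any(vertices_at_level(code, order, i) < s
               for order in permutations(range(n)))

C = {"00000", "11010", "01101", "10111"}
answer_level_2 = exists_ordering(C, 2, 2 ** 2)
answer_level_1 = exists_ordering(C, 1, 2 ** 1)
```

For this code the first and fourth coordinates always agree, so placing them first leaves only two vertices at time 2; no ordering can do better than two vertices at time 1.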


Alternative  graphical  models  for  Boolean  functions.  In  recent  years,  a  number  of 
new graphical models have emerged in coding theory, and evolved into a far-reaching general
framework for representing a code by a graph. In this context, one encounters various
generalizations of a trellis, such as tail-biting trellises [22] and trellis formations [49, 50],
as well as Tanner graphs [71] that are in some sense diametrically opposite to trellises.
All these representations are special cases of the general concept of a factor graph.
We  refer  the  reader  to  [1,  37,  38,  79]  for  a  detailed  treatment  of  factor  graphs  and  the 
associated  iterative  manipulation  algorithms:  the  min-sum  and  the  sum-product. 

The  success  of  these  graphical  models  in  coding  theory  and  communications  has  been 
spectacular.  For  example,  tail-biting  trellis  representations  have  been  found  in  [22,  48]  for 
several  well-known  codes,  whose  complexity  is  the  square  root  of  the  lowest  complexity 
achievable  with  the  conventional  minimal  trellis.  On  a  grander  scale,  turbo  codes  [8] 
represented  by  a  factor  graph  and  decoded  with  an  iterative  sum-product  algorithm  have 
been  shown  to  approach  channel  capacity  with  feasible  complexity,  a  goal  that  eluded 
the  research  community  for  almost  50  years.  More  recently,  similar  results  have  been 
established  [59,  70]  for  low-density  parity-check  codes,  represented  by  a  Tanner  graph. 

It  remains  to  be  seen  whether  any  of  the  graphical  models  mentioned  in  the  foregoing 
paragraphs  can  be  used  to  efficiently  represent  Boolean  functions  in  the  context  of  logic 
synthesis  and  verification.  As  an  example,  consider  the  well-known  hidden  weighted  bit 
Boolean  function,  defined  by 


$$f_{\rm hw}(x_1,\ldots,x_n) \stackrel{\rm def}{=} \begin{cases} 0 & \text{if } {\rm wt}(x) = 0 \\ x_{{\rm wt}(x)} & \text{if } {\rm wt}(x) > 0 \end{cases}$$

where ${\rm wt}(x)$ is the number of non-zeros in $(x_1, x_2, \ldots, x_n)$. Bryant [15] proved that any
OBDD representation of this function requires at least $\Omega(1.14^n)$ vertices, yet there exists
an alternative implementation of $f_{\rm hw}(x_1,\ldots,x_n)$ with area-time complexity of $O(n^{1+\epsilon})$.
We point out here that this alternative implementation is essentially a factor-graph
implementation [49]. We refrain from pursuing this any further in this paper. However,
we believe that this line of research holds great promise.
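The hidden weighted bit function is simple to state in code, which makes its resistance to OBDD representation all the more striking. The sketch below uses the standard definition (the output is the bit indexed by the Hamming weight, or 0 for the all-zero input) and adds a brute-force helper, `distinct_cofactors`, that counts the distinct subfunctions left after fixing the first $i$ variables under the natural ordering. The helper is our own illustration and makes no attempt to reproduce the bound of [15].

```python
from itertools import product

def hidden_weighted_bit(x):
    """f_hw(x) = 0 if wt(x) = 0, and x_{wt(x)} otherwise (1-based index)."""
    w = sum(x)
    return 0 if w == 0 else x[w - 1]

def distinct_cofactors(n, i):
    """Number of distinct subfunctions after fixing x_1..x_i (natural ordering)."""
    tails = list(product((0, 1), repeat=n - i))
    return len({tuple(hidden_weighted_bit(head + tail) for tail in tails)
                for head in product((0, 1), repeat=i)})

# Width of each level for a small instance; the work grows as 2^n.
widths = [distinct_cofactors(8, i) for i in range(9)]
```

Because each call enumerates all $2^n$ inputs, this is usable only for small $n$.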


5.2.  From  binary  decision  diagrams  to  trellises 

Since 1986, when ordered binary decision diagrams were introduced for verification
problems [14], many refinements and variations of the basic data structures and algorithms
have been proposed. Here, we discuss how these and other results pertaining to binary
decision diagrams may be applied to trellises.


Almost all codes have exponential trellis complexity. It is known [39, 58, 77]
that almost all $n$-variable Boolean functions cannot be represented by an OBDD with
fewer than $2^n/(2n)$ vertices, regardless of the variable ordering. More precisely, Liaw and
Lin [58] establish* the following result. Let $\omega(n) = 2^{2^n}$ be the total number of $n$-variable
Boolean functions, or equivalently binary codes of length $n$, and let $\gamma(n)$ denote the
number of $n$-variable functions whose OBDD, under optimal variable order, has fewer
than $2^n/(2n)$ vertices. It is shown in [58] that

$$\lim_{n \to \infty} \frac{\gamma(n)}{\omega(n)} = 0 \qquad (7)$$

We know from Theorem 1 that the minimal proper trellis for $C$ has at least as many
vertices as the OBDD for $f_C$. Thus the result of (7) transfers directly to minimal proper
trellises. This was not previously known in the trellis literature. It is known that all
asymptotically good codes have exponential trellis complexity [53], and almost all linear
codes are asymptotically good [72, p.77]. However, it is not difficult to see that almost
all nonlinear codes are not asymptotically good.

Liaw and Lin [58] also consider quasi-reduced OBDDs, obtained by applying only the
merging rule (step 2 in Construction A) and not the redundant-tests deletion rule (step 3
in Construction A). It is obvious from Theorem 1 that a quasi-reduced OBDD for a
function $f$ is precisely the minimal proper trellis for $C_f$, whether $d(C_f) > 1$ or $d(C_f) = 1$.
Asymptotically as $n \to \infty$, Liaw and Lin [58] observe that for virtually all Boolean
functions, the merging rule contributes a factor of $1/n$ to the overall reduction in the size of
the OBDD, whereas the redundant-test deletion rule contributes only a constant factor.
For fixed $n$, Liaw and Lin [58] find empirically that the merging rule alone accounts for
over 99% of the average reduction in the size of the OBDD, whenever $n > 15$. They thus
suggest that under certain circumstances, it is more advantageous to use quasi-reduced
OBDDs (namely, trellises!), since then the level-index field can be eliminated from the
vertex record, resulting in more significant savings in the overall storage space than those
obtained by the redundant-tests deletion rule.

Liaw and Lin [58] also show that for all $n$-variable Boolean functions, the quasi-reduced
OBDD has at most $(2+\epsilon)(2^n/n)$ vertices for all sufficiently large $n$, regardless of the
variable ordering. Clearly, this bound transfers directly to trellises. An interesting conclusion
from this result, in conjunction with (7), is that the complexity of the minimal proper
trellis for almost all binary codes is not sensitive to permutations of the time axis: the
trellis has at least $2^n/(2n)$ vertices for the best possible permutation, and at most about 4 times
as many vertices for the worst possible permutation. This insensitivity phenomenon,
well-known [39, 58, 77] in the OBDD literature, was not previously observed for trellises.
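The factor of four comes from dividing the upper bound by the lower bound. Assuming the two bounds are read as $(2+\epsilon)(2^n/n)$ and $2^n/(2n)$, the ratio is

$$\frac{(2+\epsilon)\,2^n/n}{2^n/(2n)} \;=\; 2\,(2+\epsilon) \;=\; 4 + 2\epsilon$$

which tends to 4 as $\epsilon \to 0$.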


* A similar result was established by Shannon [68] in the context of two-terminal contact networks.

Multi-terminal trellises/syndrome decision diagrams. Multi-terminal binary
decision diagrams [23] are extensions of BDDs for representing functions $f : \{0,1\}^n \to S$,
where $S$ is any finite set. A multi-terminal BDD differs from a conventional OBDD only
in that it may have multiple terminals, rather than two terminals labeled 0 and 1.

The  notion  of  multi-terminal  BDDs  can  be  exploited  to  construct  a  multi-terminal  trellis 
that simultaneously represents a binary linear code $C$ as well as all the cosets of $C$, in $\mathbb{F}_2^n$
or in a given subspace of $\mathbb{F}_2^n$. Multi-terminal trellises were used by Ytrehus [41, 80] to
represent  the  parallel  branch  codes  encountered  in  the  decoding  of  partial  unit  memory 
convolutional  codes.  In  general,  such  trellises  are  useful  whenever  one  needs  to  decode 
a  partition  of  a  given  space  into  cosets  of  a  given  code.  This  task  is  at  the  core  of  the 
coset-decoding  technique  [25]  and  is  frequently  encountered  in  multilevel  coding  [34,  44]. 

Another application of multi-terminal trellises is as an attractive alternative to the
well-known standard array decoding technique, which we now briefly describe. Let $C$ be an
$(n, k, d)$ binary linear code, and let $H = [h_1, \ldots, h_n]$ be a parity-check matrix for $C$.
The standard array for $C$ is the $2^{n-k} \times 2^k$ matrix with entries from $\mathbb{F}_2^n$, having the cosets
of $C$ in $\mathbb{F}_2^n$ as its rows. For each coset, we may pre-compute the coset leader $v$, defined
as a vector of minimal Hamming weight in the coset. For each $x \in \mathbb{F}_2^n$, the syndrome
of $x$ with respect to $H$ is defined as $Hx^t$. Given the channel output $y \in \mathbb{F}_2^n$, we first
compute the syndrome $s = Hy^t$, and then decode $y$ to $c = y - v \in C$, where $v$ is the coset
leader of the coset consisting of all the vectors whose syndrome with respect to $H$ is $s$.
This procedure, known as standard array decoding and illustrated in Figure 7, achieves
hard-decision maximum-likelihood decoding of $C$ on a binary symmetric channel.


Figure  7.  Standard  array  decoding 
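
As an illustration only, the standard array decoding procedure described above can be sketched in a few lines of Python for the (5, 2, 3) code C = {00000, 11010, 01101, 10111} used later in this section. The parity-check matrix `H` below is an assumption (one valid systematic choice, not necessarily the matrix used in the figures), and the names `syndrome`, `coset_leaders` and `decode` are hypothetical helpers introduced here. A different choice of H changes the syndrome labels but not the decoded codeword.

```python
from itertools import product

# Assumed parity-check matrix for C = {00000, 11010, 01101, 10111};
# any other valid H relabels the syndromes but yields the same decoding.
H = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
]
N = len(H[0])


def syndrome(x):
    """Return H x^t over F_2 as a tuple of bits."""
    return tuple(sum(h * b for h, b in zip(row, x)) % 2 for row in H)


def coset_leaders():
    """Map each syndrome to a minimum-weight vector of its coset."""
    leaders = {}
    # Visit F_2^n in order of increasing Hamming weight, so the first
    # vector seen for a given syndrome is a coset leader.
    for x in sorted(product((0, 1), repeat=N), key=sum):
        leaders.setdefault(syndrome(x), x)
    return leaders


def decode(y, leaders):
    """Standard array decoding: subtract the leader of the coset of y."""
    v = leaders[syndrome(y)]
    return tuple((a - b) % 2 for a, b in zip(y, v))


LEADERS = coset_leaders()
decoded = decode((1, 0, 0, 1, 1), LEADERS)  # codeword 10111
```

The table `LEADERS` has $2^{n-k} = 8$ entries, one per coset, and is built here by brute-force enumeration of all $2^n$ vectors, which is exactly the cost that the multi-terminal trellis is meant to avoid.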

With a multi-terminal trellis, we can represent the standard array compactly, avoid the
brute-force enumeration of cosets, and obtain a linear-time procedure for both syndrome
computation and decoding. The idea is to construct a multi-terminal BDD for the function
$h_C(x_1, \ldots, x_n) = Hx^t$, using a procedure analogous to the BCJR construction [5].
In addition, we will carry out dynamic programming during the construction to label
each vertex by the minimum weight path that leads to it.

The BCJR trellis $T = (V, E, \mathbb{F}_2)$ for a linear code C of length n is constructed [5] by
identifying vertices with partial codeword syndromes:

$$V_i \stackrel{\mathrm{def}}{=} \{\, c_1 h_1 + \cdots + c_i h_i : (c_1, \ldots, c_n) \in C \text{ for some } c_{i+1}, \ldots, c_n \in \mathbb{F}_2 \,\} \qquad (8)$$

with $V_0 = \{0\}$ by convention. There is an edge $e \in E_i$ from $v \in V_{i-1}$ to $u \in V_i$ if and only
if there exists a codeword $(c_1, c_2, \ldots, c_n) \in C$ such that $c_1 h_1 + \cdots + c_{i-1} h_{i-1} = v$ and
$c_1 h_1 + \cdots + c_i h_i = u$. The label of this edge is $\alpha(e) = c_i$. The multi-terminal
trellis $\mathcal{D} = (V', E', \mathbb{F}_2)$ may be constructed in a manner analogous to (8) by computing
partial syndromes for all vectors in $\mathbb{F}_2^n$, not just the codewords of C. Thus, we define

$$V'_i \stackrel{\mathrm{def}}{=} \{\, x_1 h_1 + \cdots + x_i h_i : (x_1, \ldots, x_n) \in \mathbb{F}_2^n \text{ for some } x_{i+1}, \ldots, x_n \in \mathbb{F}_2 \,\} \qquad (9)$$

with $V'_0 = \{0\}$ by convention. There is an edge $e \in E'_i$ from $v \in V'_{i-1}$ to $u \in V'_i$ with label
$\alpha(e) = x \in \mathbb{F}_2$ if and only if $u = v + x h_i$. It is easy to see that the resulting graph $\mathcal{D}$ is
the multi-terminal BDD for the function $h_C(x_1, \ldots, x_n) = Hx^t$. Note that the minimal
trellis for C is contained in $\mathcal{D}$ as a proper subgraph. Also notice that by replacing $\mathbb{F}_2^n$
in (9) by an arbitrary subspace S such that $C \subset S \subset \mathbb{F}_2^n$, we obtain the multi-terminal
trellis that represents the cosets of C in that subspace.

Figure 8. The syndrome decision diagram for C = {00000, 11010, 01101, 10111}

The terminal corresponding to $y \in \mathbb{F}_2^n$ is labeled by the syndrome $Hy^t$, and the
arrows indicate paths to be taken to obtain a minimum weight error vector. The
syndrome may be calculated by trickling the received vector down the diagram.
For example, if the vector y = (10011) is received, the corresponding path from the
root ends in the terminal labeled $Hy^t = (100)^t$. Following the backpointers from
this terminal, the error vector is determined to be (00100).
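
The vertex sets in (8) and (9) can be enumerated directly for a small code. The following Python sketch does this for C = {00000, 11010, 01101, 10111}, under the assumption of one particular parity-check matrix `H` (any valid H would do); the helper names `step`, `multi_terminal_levels` and `bcjr_levels` are hypothetical. Only the vertex sets of the BCJR trellis are built, which is enough to check the containment noted above.

```python
# Assumed parity-check matrix for C = {00000, 11010, 01101, 10111}.
H = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
]
CODE = [(0, 0, 0, 0, 0), (1, 1, 0, 1, 0), (0, 1, 1, 0, 1), (1, 0, 1, 1, 1)]
COLS = list(zip(*H))      # columns h_1, ..., h_n of H
ZERO = (0,) * len(H)      # the all-zero partial syndrome


def step(v, h, bit):
    """Return v + bit * h over F_2."""
    return tuple((a + bit * b) % 2 for a, b in zip(v, h))


def multi_terminal_levels():
    """Vertex sets V'_0..V'_n and edge lists E'_1..E'_n as in (9)."""
    levels, edges = [{ZERO}], []
    for h in COLS:
        # Every vertex has one outgoing edge per label x in F_2.
        layer = [(v, bit, step(v, h, bit))
                 for v in sorted(levels[-1]) for bit in (0, 1)]
        edges.append(layer)
        levels.append({u for _, _, u in layer})
    return levels, edges


def bcjr_levels():
    """Vertex sets V_0..V_n as in (8): partial syndromes of codewords."""
    levels = [{ZERO}]
    for i in range(1, len(COLS) + 1):
        level = set()
        for c in CODE:
            s = ZERO
            for bit, h in zip(c[:i], COLS):
                s = step(s, h, bit)
            level.add(s)
        levels.append(level)
    return levels


D_LEVELS, D_EDGES = multi_terminal_levels()
T_LEVELS = bcjr_levels()
```

For this choice of H the level sizes are 1, 2, 4, 8, 8, 8 for the multi-terminal trellis and 1, 2, 4, 4, 2, 1 for the BCJR trellis, and each $V_i$ is a subset of the corresponding $V'_i$.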

By carrying out a simple dynamic programming algorithm on $\mathcal{D}$ during its construction,
maintaining for each vertex the minimum weight path reaching that vertex and a corresponding
pointer back to the previous level, we can determine the minimum weight path
to every vertex in $\mathcal{D}$, and therefore to every syndrome. The straightforward details are
omitted. The resulting data structure, which we call the syndrome decision diagram for C,
is illustrated in Figure 8 for the (5, 2, 3) linear code C = {00000, 11010, 01101, 10111}.

Given a syndrome decision diagram $\mathcal{D}$, a maximum-likelihood decoder for C can be
implemented as follows. First we evaluate the received vector y in $\mathcal{D}$, thus computing
the function $h_C(y_1, \ldots, y_n) = Hy^t$, which gives the syndrome of y, and then trace back
from the corresponding terminal to find a coset leader v in the coset of y. The decoded
codeword is then $c = y - v$.
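
A minimal sketch of this two-pass decoder, including the dynamic programming step whose details are omitted above, might look as follows. It is one possible realization and not necessarily the construction intended in the figure: the parity-check matrix `H` is an assumed choice for C = {00000, 11010, 01101, 10111}, and `build_sdd` and `decode` are hypothetical names. Each vertex stores its minimum path weight together with a back-pointer (previous vertex and edge label).

```python
# Assumed parity-check matrix for C = {00000, 11010, 01101, 10111}.
H = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
]
COLS = list(zip(*H))      # columns h_1, ..., h_n of H
ZERO = (0,) * len(H)


def step(v, h, bit):
    """Return v + bit * h over F_2."""
    return tuple((a + bit * b) % 2 for a, b in zip(v, h))


def build_sdd(cols):
    """Per-level dicts: vertex -> (min weight, previous vertex, edge bit)."""
    levels = [{ZERO: (0, None, None)}]
    for h in cols:
        nxt = {}
        for v, (w, _, _) in levels[-1].items():
            for bit in (0, 1):
                u = step(v, h, bit)
                # Keep only the lightest path into u (dynamic programming).
                if u not in nxt or w + bit < nxt[u][0]:
                    nxt[u] = (w + bit, v, bit)
        levels.append(nxt)
    return levels


def decode(y, cols, levels):
    """Return (syndrome, error vector, codeword) for the received word y."""
    v = ZERO
    for bit, h in zip(y, cols):          # forward pass: evaluate y
        v = step(v, h, bit)
    syn = v
    error = []
    for level in reversed(levels[1:]):   # backward pass: follow pointers
        _, prev, bit = level[v]
        error.append(bit)
        v = prev
    error.reverse()
    codeword = tuple((a + e) % 2 for a, e in zip(y, error))
    return syn, tuple(error), codeword


SDD = build_sdd(COLS)
result = decode((1, 0, 0, 1, 1), COLS, SDD)
```

For y = (10011) the sketch returns the error vector (00100) and the codeword (10111), in agreement with the example of Figure 8. With the H assumed here the syndrome label is (001) rather than (100), since the labels depend on the choice of parity-check matrix. Both passes visit one vertex per level, so decoding takes time linear in n once the diagram has been built.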

The standard-array decoding procedure, illustrated in Figure 7, has space complexity
$O(2^{n-k})$ and decoding complexity $O(n^2)$, since the computation of the syndrome $s = Hy^t$
is in general quadratic in the block length. Construction of the syndrome decision diagram
requires $O(n 2^{n-k})$ space and time. However, once the diagram is available,
both syndrome computation and decoding can be accomplished in linear time.

In  general,  as  a  computational  device  that  computes  syndromes  in  linear  time,  syndrome 
decision  diagrams  would  be  useful  in  many  different  contexts  in  coding  theory. 


Reed-Muller  expansions  and  OFDDs.  One  approach  to  obtaining  more  compact 
representations  of  Boolean  functions  has  been  to  change  the  interpretation  of  the  vertices 
within the data structure. As discussed in Section 2, OBDDs represent the Shannon
expansion (1) of a Boolean function. An alternative expansion of a Boolean function can
be expressed in terms of the exclusive-OR operation:

$$f = f_{\bar{x}} \oplus x \cdot f_{\delta x} = f_x \oplus \bar{x} \cdot f_{\delta x} \qquad (10)$$

where $f_{\delta x} = f_x \oplus f_{\bar{x}}$ is the Boolean difference of f with respect to x. The first equality
in (10) is known as either the Reed-Muller or the positive Davio expansion, while the
second equality is referred to as the negative Davio expansion. These decompositions are
analogous to the Taylor expansion of a differentiable function.
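
Both equalities in (10) are easy to confirm by exhaustive evaluation. The sketch below, with hypothetical helper names `cofactor`, `boolean_difference` and `check_expansions`, checks the expansion around the negative cofactor $f_{\bar{x}}$ and the expansion around the positive cofactor $f_x$ for every variable of a given Boolean function. As a test function it uses the indicator of the example code C = {00000, 11010, 01101, 10111}, which takes the value 0 exactly on the codewords.

```python
from itertools import product

CODE = {(0, 0, 0, 0, 0), (1, 1, 0, 1, 0), (0, 1, 1, 0, 1), (1, 0, 1, 1, 1)}


def indicator(x):
    """Boolean function of the code: 0 on codewords, 1 elsewhere."""
    return 0 if x in CODE else 1


def cofactor(f, i, value):
    """Restrict variable i of f to the given value."""
    return lambda x: f(x[:i] + (value,) + x[i + 1:])


def boolean_difference(f, i):
    """Return the function f_x XOR f_xbar for variable i."""
    f0, f1 = cofactor(f, i, 0), cofactor(f, i, 1)
    return lambda x: f1(x) ^ f0(x)


def check_expansions(f, n):
    """Check both equalities of (10) for every variable and every input."""
    for i in range(n):
        f0, f1 = cofactor(f, i, 0), cofactor(f, i, 1)
        df = boolean_difference(f, i)
        for x in product((0, 1), repeat=n):
            around_negative = f0(x) ^ (x[i] & df(x))
            around_positive = f1(x) ^ ((1 - x[i]) & df(x))
            if not (f(x) == around_negative == around_positive):
                return False
    return True


ok = check_expansions(indicator, 5)
```

The check succeeds for any Boolean function, since setting x = 0 or x = 1 in either expression collapses it to the corresponding cofactor.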

The  Reed-Muller  expansion  can  be  used  [46]  as  the  basis  for  graphical  representations 
called  ordered  functional  decision  diagrams.  This  representation  is  analogous  to  OBDDs, 
except  that  the  outgoing  edges  from  a  vertex  represent  the  negative  cofactor  and  Boolean 
difference  of  the  function  with  respect  to  the  vertex  variable.  Ordered  functional  decision 
diagrams  (OFDDs)  have  many  properties  in  common  with  OBDDs.  For  example,  the 
representation  is  canonical,  and  can  be  constructed  using  a  similar  algorithm  for  merging 
and  eliminating  vertices,  with  a  different  reduction  rule  for  removing  redundant  tests. 
The OFDD for our example code C = {00000, 11010, 01101, 10111} is shown in Figure 9.

Figure 9. The OFDD and the OBDD for C = {00000, 11010, 01101, 10111}

For some classes of functions, OFDDs are exponentially smaller than the corresponding
OBDDs, although the reverse can also hold. One interesting direction for future research
would be to explore applications of OFDDs to channel coding, by investigating
decoding algorithms based on this representation.

One  useful  decoding  algorithm  for  trellises  is  the  forward-backward  algorithm,  also  called 
the  BCJR  algorithm  after  the  authors  of  [5],  who  first  developed  this  algorithm  in  the 
trellis  context.  The  forward-backward  algorithm  is  widely  used  in  practice,  for  example 
in the decoding of turbo codes [8], to obtain maximum a posteriori (MAP)
decoding  of  each  code  symbol.  The  complexity  of  this  algorithm  is  polynomial  in  the  size 
of  the  trellis,  or  equivalently  the  size  of  the  single-terminal  OBDD  for  the  code.  However, 
we  will  now  show  that  it  is  unlikely  that  the  calculations  required  in  the  forward-backward 
algorithm  can  be  carried  out  efficiently  using  the  OFDD  representation. 

The forward-backward algorithm assumes knowledge of the channel transition probability
function $p(y|x)$, where $x \in C$ is the channel input and y is the vector observed at the
channel output. For a given binary code C, the algorithm effectively computes

$$\mathcal{S}_0(x_i) \stackrel{\mathrm{def}}{=} \sum_{\substack{x \in C \\ x_i = 0}} p(y|x) \quad \text{and} \quad \mathcal{S}_1(x_i) \stackrel{\mathrm{def}}{=} \sum_{\substack{x \in C \\ x_i = 1}} p(y|x) \qquad (11)$$

for all $i = 1, 2, \ldots, n$, and decodes the i-th code bit $x_i$ to either 0 or 1, according as
$\mathcal{S}_0(x_i) > \mathcal{S}_1(x_i)$ or $\mathcal{S}_1(x_i) > \mathcal{S}_0(x_i)$. Although the formulation of the forward-backward
algorithm in (11) is general, we will restrict our attention to the simplest possible channel
model: the binary symmetric channel with cross-over probability $\theta$. Thus the channel
output is binary, and the transition probability function is given by:

$$p_\theta(y|x) = \theta^{d(x,y)} (1 - \theta)^{n - d(x,y)} \qquad (12)$$

where d(x, y) is the Hamming distance and $\theta \in [0, 1]$ is a real constant. The decoding
algorithm that we seek must work for any $\theta$, thought of as a parameter of the algorithm.

Proposition 4. Let C be an arbitrary binary block code, and let T be an OFDD for C.
Then there is no polynomial-time algorithm in the size of T for computing the expressions
$\mathcal{S}_0(x_i)$ and $\mathcal{S}_1(x_i)$ in (11) for the function $p_\theta(y|x)$ in (12), unless P = NP.

Proof. The key idea of the proof is to observe that on a binary symmetric channel
with $\theta = 0.5$, the forward-backward algorithm simply counts the number of codewords
that have 0, respectively 1, in the specified position. Indeed, for $\theta = 0.5$, we have

$$\mathcal{S}_0(x_i) + \mathcal{S}_1(x_i) = \sum_{\substack{x \in C \\ x_i = 0}} (0.5)^{d(x,y)} (0.5)^{n - d(x,y)} + \sum_{\substack{x \in C \\ x_i = 1}} (0.5)^{d(x,y)} (0.5)^{n - d(x,y)} = \frac{|C|}{2^n}$$

Thus, as a special case, such an algorithm could be used to compute the size of the
code. It is shown in [78], however, that the problem of computing $|C_f|$ using the OFDD
representation of a Boolean function $f(x_1, \ldots, x_n)$ is #P-complete. ∎
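
The counting argument in the proof can be checked numerically on a small example. The sketch below evaluates the sums in (11) for the binary symmetric channel (12) by brute-force enumeration of the codewords; it is not the trellis-based forward-backward algorithm, only a direct evaluation of the quantities that algorithm computes. The example code C = {00000, 11010, 01101, 10111} and the function names `transition`, `symbol_sums` and `map_decode` are choices made here for illustration.

```python
CODE = [(0, 0, 0, 0, 0), (1, 1, 0, 1, 0), (0, 1, 1, 0, 1), (1, 0, 1, 1, 1)]
N = 5


def transition(y, x, theta):
    """BSC transition probability p_theta(y|x) from (12)."""
    d = sum(a != b for a, b in zip(x, y))   # Hamming distance d(x, y)
    return theta ** d * (1 - theta) ** (N - d)


def symbol_sums(y, i, theta):
    """Return (S_0(x_i), S_1(x_i)) from (11), by enumerating the code."""
    s = [0.0, 0.0]
    for x in CODE:
        s[x[i]] += transition(y, x, theta)
    return s[0], s[1]


def map_decode(y, theta):
    """Symbol-by-symbol decision; ties are resolved in favour of 1."""
    bits = []
    for i in range(N):
        s0, s1 = symbol_sums(y, i, theta)
        bits.append(0 if s0 > s1 else 1)
    return tuple(bits)


y = (1, 0, 0, 1, 1)
decoded = map_decode(y, 0.1)
# At theta = 0.5 every codeword contributes 2^-n, so 2^n * (S_0 + S_1) = |C|.
size = [2 ** N * sum(symbol_sums(y, i, 0.5)) for i in range(N)]
```

At $\theta = 0.5$ the rescaled sum equals |C| = 4 for every position i, and $2^n \mathcal{S}_0(x_i)$ alone counts the codewords with a 0 in position i. This is the sense in which an efficient procedure for (11) would also count codewords.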

We  conclude  that  OFDDs  are  not  suitable  for  the  kind  of  calculations  required  in  the 
forward-backward  algorithm,  at  least  for  general  binary  codes.  It  is  still  possible  that 
OFDDs  can  be  used  efficiently  in  the  context  of  the  forward-backward  algorithm  in  the 
special case of linear codes. It is also possible that maximum-likelihood decoding, as opposed
to symbol-by-symbol MAP decoding, can be implemented efficiently with OFDDs.

Binary  moment  diagrams.  There  have  been  several  efforts  to  extend  the  concept 
of  BDDs  to  represent  functions  over  Boolean  variables,  but  having  non-Boolean  ranges, 
such  as  the  integers  or  the  real  numbers. 


Figure 10. The BMD for a linear code with parity-check matrix $H = [h_1, h_2, \ldots, h_n]$

One approach to representing numeric functions, especially those encountered in arithmetic
circuit verification, involves changing the function decomposition with respect to
its variables, in a manner analogous to the use of Reed-Muller expansions for FDDs.
The moment decomposition of a function is obtained as

$$f = f_{\bar{x}} + x \cdot (f_x - f_{\bar{x}}) = f_{\bar{x}} + x \cdot f_{\partial x}$$

where $f_{\partial x} = f_x - f_{\bar{x}}$ is called the linear moment of f with respect to x. The resulting
representation is known [20] as the binary moment diagram or BMD.

Our conclusion with regard to binary moment diagrams is, again, negative. As pointed
out to us by Randy Bryant [19], this representation turns out not to be useful for codes.
Indeed, let C be an (n, k, d) linear code with parity-check matrix $H = [h_1, h_2, \ldots, h_n]$.
Then the binary moment diagram of the $\mathbb{F}_2^{n-k}$-valued function $h_C(x_1, \ldots, x_n) = Hx^t$ is
the tree shown in Figure 10. Using the fact that the BMD representation is canonical [20],
this statement follows by a simple induction on the block length of the code.
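
For readers unfamiliar with the moment decomposition introduced above, the following sketch checks it on integer-valued functions of Boolean variables, where ordinary addition and subtraction apply. The helper names `cofactor`, `linear_moment` and `check_moment_decomposition`, and the test function `binary_value`, are assumptions made for illustration; the sketch does not attempt to reproduce the diagram of Figure 10.

```python
from itertools import product


def cofactor(f, i, value):
    """Restrict variable i of f to the given value."""
    return lambda x: f(x[:i] + (value,) + x[i + 1:])


def linear_moment(f, i):
    """Return the function f_x - f_xbar for variable i."""
    f0, f1 = cofactor(f, i, 0), cofactor(f, i, 1)
    return lambda x: f1(x) - f0(x)


def check_moment_decomposition(f, n):
    """Check f = f_xbar + x * (f_x - f_xbar) for every variable and input."""
    for i in range(n):
        f0, moment = cofactor(f, i, 0), linear_moment(f, i)
        for x in product((0, 1), repeat=n):
            if f(x) != f0(x) + x[i] * moment(x):
                return False
    return True


def binary_value(x):
    """Integer encoded by the bits of x, least significant bit first."""
    return sum(b << k for k, b in enumerate(x))


ok = check_moment_decomposition(binary_value, 4)
# For this word-level function the moment of bit k is the constant 2^k.
moments = [linear_moment(binary_value, k)((0, 0, 0, 0)) for k in range(4)]
```

Functions of this kind, whose linear moments are constants, are the typical arithmetic-circuit case mentioned at the start of this discussion.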


6.  Conclusions  and  discussion 

We  have  established  a  correspondence  between  ordered  binary  decision  diagrams  and 
minimal  trellises,  proving  that  the  single-terminal  OBDD  for  a  binary  code  C,  viewed  as 
a  Boolean  function,  is  isomorphic  to  the  minimal  proper  trellis  for  C,  provided  d(C)  >  1. 

Although we have emphasized the similarities between the two data structures throughout
this paper, one should also be aware of the differences between them. It appears
that the major distinction between trellises and OBDDs results from the elimination of
redundant tests, which does not preserve the depth structure of a trellis. This distinction
becomes vacuous if d(C) > 1. The restriction d(C) > 1 does not have much of an
impact in coding theory: any useful code will have minimum distance greater than 1.
However, there is no reason why the off-set of a useful Boolean function should satisfy
this requirement. Thus every reasonable trellis is an OBDD, but not vice versa.

Another  significant  distinction  between  the  theory  of  binary  decision  diagrams  and  trellis 
theory  stems  from  a  difference  in  emphasis.  While  most  of  the  research  in  channel  coding 
is  focused  on  linear  codes,  the  corresponding  class  of  Boolean  functions  has  not  received 
much  attention  in  logic  synthesis  and  formal  verification. 

Despite  the  dissimilarities  discussed  above,  we  have  demonstrated  that  the  connection 
between  trellises  and  OBDDs  opens  up  many  possibilities  for  leveraging  the  extensive 
work  that  has  been  carried  out  independently  in  two  previously  unconnected  disciplines. 
We hope that this paper will stimulate further research in this direction.